Porosity Classification and Clustering in
Ti-6Al-4V alloy parts fabricated by
Additive Manufacturing
This thesis is submitted to University College Dublin in fulfilment of
the requirements for the degree of Master of Engineering Science
in Mechanical and Materials Engineering.
Conor Sheehan
National University of Ireland
University College Dublin
College of Engineering and Architecture
School of Mechanical and Materials Engineering
Prof. Denis D. Dowling, Principal Supervisor
Prof. Kenneth T. Stanton, Head of School
ABSTRACT
The additive manufacturing of metal alloys by laser powder bed fusion (LPBF)
requires the optimisation of powder melting conditions in order to achieve the desired
material properties. Reducing the level of porosity incorporated in the alloy is a key
consideration during powder melting. In addition to the overall level of porosity, a further
consideration is its type (i.e. gas, keyhole or lack of fusion), as this in turn can
also influence the material properties of the printed alloy. This thesis is focused on the
printing of Ti-6Al-4V parts using the Renishaw RenAM 500M production-scale additive
manufacturing system. During part printing, argon gas is passed over the print bed in
order to provide a non-oxidising atmosphere, as well as to remove process
by-products. This study includes an evaluation of the effect of varying the argon gas flow
rate on the porosity of the resulting printed alloy. During printing, the melting of the alloy
part was monitored using optical emission spectroscopy (InfiniAM system), and a
correlation was obtained between the in-process emission data from the
meltpool and the level of porosity in the printed alloy parts. Porosity was
determined using both micro X-ray computed tomography (XCT) and optical
microscopy, the latter based on cross-sectional studies of the printed
alloy samples. Significantly increased porosity in the alloy was observed when lower
argon gas flow rates were used during printing.
Unsupervised machine learning algorithms were applied to data obtained from both
the XCT and microscopy examinations, to determine whether these data could be used to
help establish the effect of gas flow on the type of porosity generated in the printed alloy
samples. Based on the clustering of XCT data, parts printed at the high argon gas flow rate
exhibited the largest proportion of pores labelled as ‘gas’ pores, in
comparison with those printed at lower Ar flow rates. The Gaussian Mixture Model
(GMM) unsupervised machine learning technique was found to be effective in
clustering the porosity data. This technique facilitated the
quantification of the relative porosity type distribution within parts printed at the three
argon gas flow levels investigated.
Supervised classification ML algorithms were also applied to a set of
manually labelled porosity data obtained from the cross-sectional microscope images
(using ImageJ). The evaluated algorithms were K-Nearest Neighbour, Logistic
Regression, Naïve Bayes, Multi-Layer Perceptron, Decision Tree Classification,
Support Vector Machine, Gradient Boosting, and Extreme Gradient Boosting. The
multiclass classification was implemented as two separate binary classification
problems: firstly gas / non-gas, and subsequently the non-gas class was classified
as Lack of Fusion / Keyhole. Through comparison of both the Gas Flow dataset
cross-validation scores and the final Test Set scores, the classification obtained for gas /
non-gas was determined to be more accurate than that obtained for the
Keyhole / Lack of Fusion classification. It was also found that, of the eight supervised
ML approaches evaluated, the Extreme Gradient Boosting model achieved the most
accurate porosity classification.
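The two-stage scheme described above (gas / non-gas first, then Lack of Fusion / Keyhole
for the non-gas pores) can be sketched as follows. This is a minimal illustration only and
not the implementation used in this thesis: the feature names (circularity, area_um2), the
threshold values and the toy rule functions are hypothetical placeholders that stand in for
trained binary classifiers such as those listed above.

```python
from typing import Callable, Dict

# Hypothetical per-pore feature record (feature names are assumptions for illustration).
Pore = Dict[str, float]
BinaryClassifier = Callable[[Pore], bool]


def classify_pore(pore: Pore,
                  is_gas: BinaryClassifier,
                  is_keyhole: BinaryClassifier) -> str:
    """Chain two binary decisions into a three-class pore label."""
    # Stage 1: gas / non-gas.
    if is_gas(pore):
        return "Gas"
    # Stage 2: only non-gas pores are split into Keyhole / Lack of Fusion.
    return "Keyhole" if is_keyhole(pore) else "Lack of Fusion"


# Placeholder rules standing in for trained models.
# The thresholds are illustrative assumptions, not values from the thesis.
def toy_is_gas(pore: Pore) -> bool:
    return pore["circularity"] >= 0.9 and pore["area_um2"] <= 500.0


def toy_is_keyhole(pore: Pore) -> bool:
    return pore["circularity"] >= 0.6


example_pores = [
    {"circularity": 0.95, "area_um2": 120.0},   # small and round
    {"circularity": 0.70, "area_um2": 2500.0},  # larger, fairly round
    {"circularity": 0.25, "area_um2": 9000.0},  # large and irregular
]
labels = [classify_pore(p, toy_is_gas, toy_is_keyhole) for p in example_pores]
print(labels)
```

In practice each stage would be any fitted binary model wrapped as a function returning
True or False; the second-stage model only ever sees pores that the first stage has
labelled as non-gas.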
TABLE OF CONTENTS:
ABSTRACT
TABLE OF CONTENTS:
LIST OF FIGURES:
LIST OF TABLES:
ACKNOWLEDGEMENT
CHAPTER 1: GENERAL INTRODUCTION:
1.1 : INTRODUCTION:
1.2 : MOTIVATION:
1.3 : PROBLEMS AND OBJECTIVES:
1.4 : THESIS FORMAT:
CHAPTER 2: LITERATURE REVIEW:
2.1 OVERVIEW
2.2 ADDITIVE MANUFACTURING
2.2.2 LPBF PROCESS PARAMETERS
2.3 DEFECTS AND POROSITY
2.3.4 POROSITY IMPACTS ON MECHANICAL PERFORMANCE
2.3.5 CHAMBER GAS FLOW IMPACT ON POROSITY
2.4 MACHINE LEARNING IN ADDITIVE MANUFACTURING
2.4.1 ML APPLICATIONS IN AM
2.4.2 UNSUPERVISED CLUSTERING
2.4.3 SUPERVISED CLASSIFICATION
2.5 LITERATURE REVIEW SUMMARY
CHAPTER 3: EXPERIMENTAL METHODS AND MATERIALS
3.1 INTRODUCTION
3.2 MATERIALS: TI-6AL-4V
3.3 RENISHAW RENAM 500M AND INFINIAM IN-PROCESS MONITORING SYSTEM
3.3.1 INFINIAM SPECTRAL
3.4 X-RAY COMPUTED TOMOGRAPHY
3.5 OPTICAL MICROSCOPY EXAMINATION
CHAPTER 4: POROSITY ANALYSIS RESULTS
4.1 INTRODUCTION
4.2 COMPARISON OF POROSITY OBTAINED THROUGH XCT AND MICROSCOPY
4.3 CONCLUSIONS
CHAPTER 5: UNSUPERVISED CLUSTERING AND CLUSTER NUMBER EVALUATION
5.1: INTRODUCTION
5.2: UNSUPERVISED CLUSTERING
5.3: CLUSTER NUMBER EVALUATION
5.3.1: HIERARCHICAL CLUSTERING AND DENDROGRAM APPROACH
5.4.2: HIERARCHICAL CLUSTERING RESULTS
5.3.2: ELBOW PLOT APPROACH
5.4.1: ELBOW PLOT RESULTS
5.3.3: BAYESIAN INFORMATION CRITERION APPROACH
5.4.3: BIC SCORE RESULTS
5.3.4: SILHOUETTE SCORE APPROACH
5.4.4: SILHOUETTE SCORE RESULTS
5.4: CLUSTER NUMBER RESULTS SUMMARY
5.5: UNSUPERVISED CLUSTERING USING GMM RESULTS
5.5.1: GMM PLOT FOR XCT DATA
5.5.2: GMM PLOT FOR MICROSCOPY DATA
5.6: DISCUSSIONS AND CONCLUSIONS
CHAPTER 6: SUPERVISED CLASSIFICATION AND EVALUATION
6.1: INTRODUCTION
6.2: SUPERVISED CLASSIFICATION
Dataset Description
Classification Implementation
6.3: CLASSIFICATION EVALUATION
6.3.1: GENERAL ACCURACY
6.3.2: AREA UNDER CURVE (AUC) SCORE AND RECEIVER OPERATOR CHARACTERISTIC (ROC) CURVE
6.3.3: RECALL AND PRECISION SCORES
6.3.4: F1 SCORES
6.4: GAS/NON-GAS 5-FOLD CROSS VALIDATION
6.5: GAS/NON-GAS TEST SET EVALUATION
6.6: KEYHOLE/LACK OF FUSION 5-FOLD CROSS VALIDATION
6.7: KEYHOLE/LACK OF FUSION TEST SET EVALUATION
6.8: DISCUSSION AND CONCLUSIONS
CHAPTER 7: CONCLUSIONS
7.1 FUTURE WORK
APPENDIX
BIC SCORES
REFERENCES
LIST OF FIGURES:
Figure 1: GE Aviation fuel injector nozzle for Superalloys. Metal AM facilitates the
production of more lightweight components in a single print process [10]
Figure 2: Cross Section Schematic of a Laser Powder Bed Fusion process at
machine scale (left) and powder scale (right) [45]
Figure 3: Schematic depicting the laser beam and powder bed interaction [53]
Figure 4: Process Parameters in LPBF (also known as SLM) with VED parameters
highlighted in yellow [63]
Figure 5: Differing forms of porosity formed at the VED values shown in LPBF
produced components. The inset images help to demonstrate the types of porosity
generated, i.e. LOF (low VED) and KH (high VED). Porosity is minimised in the
highlighted light blue region by minimising both KH pores and LOF pores [63].
Figure 6: Microstructural pore type morphologies, generated with common (upward)
build direction. Lack of Fusion (a), Gas (b) and Keyhole (c) [44]
Figure 7: Microscopy image of LOF Porosity, with characteristic unmelted powder
present within the pore, as well as irregular shape and relatively large size
Figure 8: Porosity Regimes for both Keyhole and Lack of Fusion porosity in metal
AM components with respect to increasing scan speed [48]
Figure 9: ANSYS stress simulation showing the Lack of Fusion pore acting as a
stress concentrator [26].
Figure 10: Schematic of Gas Flow in general LPBF process chamber, depicting gas
flow powder bed interactions [65]
Figure 11: Taxonomy of ML applications within the AM industry [33].
Figure 12: Process Structure Property (PSP) relation chain schematic in AM [33].
Text within boxes represents available data, while arrows and bold text represent
examples of potential ML applications input and output data sources.
Figure 13: Schematic outlining input parameters for CNN model applied to LPBF
production of Ti-6Al-4V. Input parameters include Thermal History from in-situ PM
data (a) as well as process parameters indicated in (b). The predicted distortion
output is shown in (c). CAMP-BD represents Convolutional and Artificial CNN applied
to Big Data [44]
Figure 14: Exclusive (left) and Overlapping Clustering (right) example. In exclusive
clustering, datapoints belong to one cluster only. In overlapping clustering, points
may belong to two unique clusters. In probabilistic clustering, each datapoint is
assigned a probability of belonging to a cluster, and then assigned to the cluster of
highest probability [49]
Figure 15: Procedure of implementation of Self Organising Map for Acoustic
Emission analysis of Fused Filament Fabrication produced ABS by Wu et al. Feature
extraction is performed on the in-process acoustic emission signal, which is then used
as an input to the SOM algorithm. Subsequently, anomalous signals were detected [59].
Figure 16: An example of Binary Classification, where only two distinct clusters exist
within the dataset [63]
Figure 17: An example of Multiclass Classification, where multiple clusters exist
within the dataset [62]
Figure 18: A case requiring Imbalanced Classification, as far more cases of the blue
‘0’ class exist in comparison to the orange ‘1’ class [62]
Figure 19: Renishaw RenAM 500M LPBF System [23]
Figure 20: Photograph of a build plate, showing the cylindrical printed alloy parts.
The 17 parts chosen for porosity analysis are marked by orange squares.
Figure 21: Schematic of the InfiniAM Spectral System [26]
Figure 22: An example of ImageJ porosity thresholding applied to a micrograph
image (left) of a Ti-6Al-4V sample produced on the Renishaw RenAM 500M. Based upon
its grayscale value, each pixel in the micrograph is labelled as ‘Porous’ or
‘Non-Porous’ (right), and geometric features relating to each individual pore are
calculated using the porosity analysis module available in ImageJ. The thresholded
image contains porous points (in black) and solid material (in white). Note that while
the Bakelite mount surrounding the solid material is also black, this area was not
included in the porosity analysis.
Figure 23: VGStudios XCT reconstruction of gas flow cylinder samples produced at
the same build plate location at high gas flow (36 per hour) (left); medium gas flow
(31 per hour) (centre); and low gas flow (26 per hour) (right)
Figure 24: Distribution of XCT Porosity across the build plates, based on
measurements obtained from 17 samples examined for each of the three gas flow
rate builds
Figure 25: An example Cross Section Image taken from the Low Gas Flow build,
demonstrating a considerable level of lack of fusion porosity
Figure 26: Comparison between images of Ti-6Al-4V obtained using XCT (a), and
from cross sectional microscopy (magnification 275X) (b), for a sample printed at the
low gas flow rate. Note the large ‘key-shape’ LOF pore, with an approximate length
of 800 μm in the y-direction at point 1 on the XCT cross section (a), with the
corresponding pore in the microscope image (b).
Figure 27: Covariance Constraint parameters for GMM plotted on open source ‘Iris’
data. Training datapoints are marked with crosses, while test datapoints are marked
with dots. The GMM model was trained using the training dataset and then evaluated
on the test set [121].
Figure 28: Two-dimensional feature space, with 6 points and resultant dendrogram.
As can be seen from the dendrogram, no clear cluster number is evident. The largest
vertical line (green) can be seen to intersect any of 4, 3 or 2 vertical lines depending
on where the horizontal (red) line is placed [123]
Figure 29: Hierarchical Clustering Dendrogram plots for (a) Low Gas Flow Rate
(26 per hour) (b) Medium Gas Flow Rate (31 per hour) (c) High Gas Flow Rate
(36 per hour)
Figure 30: Elbow plot comparison, with definite elbow point present (left), and no
elbow point present (right). This lack of a clear elbow in the plot indicates that a new
cluster number evaluation method is required [48]
Figure 31: Elbow Plots for XCT analysis of LPBF produced Ti-6Al-4V cylinders
produced at: (a) Low Gas Flow Rate (26 per hour) (b) Medium Gas Flow Rate
(31 per hour) (c) High Gas Flow Rate (36 per hour)
Figure 32: Low Gas Flow XCT data BIC values vs Cluster number for GMM
clustering
Figure 33: Depiction of values used in silhouette score calculation [99]
Figure 34: Silhouette scores for GMM plot with full covariance per cluster number for
each data source. Each data source contains three gas flow rates. Note that in (c),
although the silhouette scores for a cluster number of 2 and 3 appear identical in the
High and Medium gas flow, the value at a cluster number of 2 is marginally larger.
Figure 35: GMM Results for XCT Analysis data from (a) Low (b) Medium (c) High
Gas Flows. Total of 95,489 pores
Figure 36: GMM Results for Optical Microscopy Analysis data from (a) Low (b)
Medium (c) High Gas Flows at Magnification of 275X. Total of 134,987 pores
Figure 37: GMM Results for Optical Microscopy Analysis data from (a) Low (b)
Medium (c) High Gas Flows at Magnification of 624X. Total of 249,826 pores
Figure 38: Example cross section images obtained from the two sample sets
referred to in this chapter, with the Gas Flow sample shown on the left and that
obtained from the Test Set sample set shown on the right
Figure 39: 5-Fold Cross Validation Example showing how the 20% of the total data
(indicated in grey) acting as a test set moves through each portion of the dataset
[100]
Figure 40: Example Confusion Matrix for a binary classification problem [101]
Figure 41: ROC Curves for 3 example models compared with flipping a coin. Better
models are shown approaching the behaviour of a perfect classifier [101]
Figure 42: Gas/ Non-Gas Pore classification ROC Curve Test set. A higher ROC
curve indicates a larger AUC score and consequently a more accurate classification.
Figure 43: Lack of Fusion Pore classification ROC Curve Test set. A higher ROC
curve indicates a larger AUC score and consequently a more accurate classification
LIST OF TABLES:
Table 1: Examples of eight classification algorithms used in AM processes and their specific
uses.
Table 2: Chemical Composition (wt%) of Ti-6Al-4V
Table 3: Processing Parameters used for the printing of the Ti-6Al-4V cylindrical parts. Note
the printing studies were carried out three times, within which the Ar shielding gas flow was
varied at three levels, as detailed in the text.
Table 4: Comparison between the porosity measurements obtained from XCT and from
ImageJ analysis of micrographs.
Table 5: Cluster Average Feature Values across each Gas Flow for XCT data. Dark rows
represent Irregular clusters, and light rows represent Regular clusters
Table 6: Pore Type proportion at each gas flow in XCT data
Table 7: Cluster Average Feature Values across each Gas Flow obtained from microscopy
images (Magnification = 275X)
Table 8: Pore Type proportion at each gas flow based on Optical Microscopy data
(M=275X). Total of 55 Samples
Table 9: Cluster Average Feature Values across each Gas Flow for Optical Microscopy
(Magnification = 624X)
Table 10: Pore Type proportion at each gas flow in micrograph data (M=624X). Total of 9
Samples
Table 11: Performance metrics for classifier algorithms applied to Gas Flow Set for
Gas/Non-Gas Classification. Average Score in each metric presented for comparative
purposes. Results are provided for the eight algorithms studied: K-Nearest
Neighbour (KNN); Decision Tree Classifiers (DTC); Naïve Bayes (NB); Support Vector
Machine (SVM); Logistic Regression (LR); Multi-Layer Perceptron (MLP); Extreme Gradient
Boosted Classification (XGB) and Gradient Boosted Classification (GB)
Table 12: Performance metrics for classifier algorithms applied to Gas Flow set for
Keyhole/Lack of Fusion Classification. Average Score in each metric presented for
comparative purposes.
Table 13: Performance metrics for classifier algorithms applied to Test Set for Gas/Non-Gas
Classification. Average Score in each metric presented for comparative purposes.
Table 14: Performance metrics for classifier algorithms applied to test set for Keyhole/Lack
of Fusion Classification. Average Score in each metric presented for comparative purposes.
Table 15: Average metric scores for all eight classification algorithms in both 5-fold cross
validation experiments upon the Gas Flow dataset. The values for the Gas/Non-Gas
classification are noticeably larger
Table 16: Total proportion of both Regular and Irregular clusters at each gas flow within the
Gas Flow dataset, based on GMM clustering
Table 17: Average metric scores for all eight classification algorithms in both 5-fold cross
validation experiments upon the Gas Flow dataset. The values for the Gas/Non-Gas
classification are noticeably larger
DECLARATION OF AUTHENTICITY
I hereby declare that this thesis is my own work and that it has not been submitted
anywhere for any award. I further declare that where other sources of information
have been used, they have been acknowledged by means of a comprehensive list of
references.
ACKNOWLEDGEMENT
Firstly, I would like to thank my supervisor Professor Denis Dowling for giving me an
opportunity to pursue this MSc. I am grateful for his support, guidance, and expertise
provided throughout my studies.
I would like to thank all staff and students at UCD and I-Form with whom I have worked
over these two years. I would like to thank Aoife Doyle, who was a great help
throughout the duration of this research. I would also like to thank Joan Kelly from
DCU for her help throughout the duration of this project.
Finally, I would like to thank my family, especially my father, mother, brothers Eoghan
and Niall, and sister Lucie, for whose support I am eternally grateful.
CHAPTER 1: GENERAL INTRODUCTION
1.1 : INTRODUCTION:
Additive manufacturing (AM) is a materials processing technology with applications in
a range of industry sectors, particularly aerospace and medical device fabrication.
It is used both in the commercial production of final products and in the rapid
production of prototypes. One of the most widely applied AM processes is Laser
Powder Bed Fusion (LPBF), a metal additive manufacturing technology used
in the production of complex metallic structures. Many end uses of LPBF are
high-value-added precision parts for components where failure would be critical.
As a result, understanding the causes of defects such as porosity during LPBF
processing is of critical importance to AM equipment manufacturers. A deep
understanding of the causes of porosity, as well as of its different
forms, is critical to achieving enhanced AM process control. In this study
the use of machine learning techniques is investigated as an approach to obtaining a
quantitative estimate of the forms of porosity present in Ti-6Al-4V parts printed using
the LPBF process. Both supervised and unsupervised machine
learning approaches are used to classify and cluster porosity data obtained from the
examination of printed alloy parts.
1.2 : MOTIVATION:
LPBF is an area of increasing interest and is becoming more widely applied in the
production of precision engineering components such as orthopaedic hip and spinal
devices. In order to assist the wider adoption of this technology,
however, a deeper understanding of the source of part defects, particularly porosity,
is important, due to their impact on the material and mechanical properties of the
printed alloy parts. To this end, an understanding of the forms of porosity present within
LPBF produced components is important, as each of the different types of porosity (lack
of fusion, keyhole and gas) has a different impact on a part’s mechanical
performance.
This MSc study aims to enhance the understanding of the types and distribution of
porosity present within alloy parts printed by LPBF. An assessment of porosity was
carried out using both X-Ray Computed Tomography (XCT) and Optical Microscopy.
Statistical machine learning models, both unsupervised and supervised, were then
applied to the various forms of porosity analysis data obtained, with the aim of gaining
an understanding of pore morphology and its distribution in LPBF produced
components.
1.3 : PROBLEMS AND OBJECTIVES:
In additive manufacturing, ‘porosity’ can sometimes have a positive impact on device
performance, such as the large-scale porosity produced in biomedical lattice
structures. Unintentional porosity, in contrast, takes the form of microstructural pores
generated during alloy printing. These can be subdivided into three groups, namely
gas, keyhole, and lack of fusion porosity. These three microstructural forms of porosity
form for a variety of different reasons during the LPBF process and have different
effects on the mechanical performance of the component. As such, it is of vital
importance that the distribution of these three classes of microstructural porosity is
understood, as well as the processing conditions which lead to their formation within
the LPBF process. These three forms of microstructural porosity will be referred to as
‘porosity’ throughout this thesis. For many parts obtained using additive
manufacturing, post-process treatments, particularly hot isostatic pressing (HIP),
are routinely used to remove this type of porosity.
Within the LPBF process, chamber purging using Argon (Ar) gas is required in order
to prevent metal oxidation during powder melting. The flowing Ar gas also helps in the
removal of process by-products, which are generated as the laser interacts with the
powder particles. This study firstly investigates the effect of Ar gas flow rate on the
porosity of the resulting alloy parts. To date, there have been no reports in the
literature on the effect of systematically altering this inert gas flow rate on the porosity
or other properties of printed alloy parts. This effect was investigated in this study by
printing Ti-6Al-4V cylinders in a production scale AM system (Renishaw RenAM
500M), at three different gas flow rates - 26 per hour (Low), 31 per hour
(Medium), and 36 per hour (High). Statistical machine learning algorithms were
applied to porosity data obtained from X-Ray Computed Tomography (XCT), as well as to
ImageJ image analysis data obtained from microscopy of
cross sections of the printed cylinders. The overall objectives were to determine:
• Does alteration of the Ar purge gas flow result in a change in the overall porosity
of the Ti-6Al-4V parts?
• Does an increasing purging gas flow rate cause a proportionate increase in one
or more forms of porosity and if so, why?
• Does this increase in gas flow cause an increase in the number of ‘types’ or
clusters present within Ti-6Al-4V components?
• Does the porosity analysis technique (i.e. CT scanning or microscopy
examination) influence the sensitivity / accuracy of the porosity analysis?
• How do a range of unsupervised and supervised machine learning algorithms
compare for the determination of the type and overall level of porosity within
printed Ti-6Al-4V alloy parts, and are these machine learning approaches
accurate for the evaluation of porosity data based on CT scans of alloy parts?
The aim of this thesis is to enhance the understanding of the effect of Ar gas flow rate,
during the LPBF process, on the morphology and porosity of printed alloy parts. Three
datasets were used to evaluate the porosity type distribution within the Ti-6Al-4V alloy
parts: XCT analysis data, along with cross-sectional microscopy studies
(ImageJ) at two different magnifications. The advantages and limitations of the use of
supervised and unsupervised machine learning for the analysis of the alloy porosity
data were also evaluated.
In the case of the unsupervised machine learning, an approach known as clustering
is implemented. Clustering is a form of machine learning whereby the model groups
the data into ‘clusters’, where each item in a cluster is more similar to the other items
within that cluster than to items within other clusters. The objective is to determine
whether clustering can be used to evaluate the type and level of porosity present in
cylinders printed at the three different Ar gas flow rates. This unsupervised machine
learning approach requires no user input, other than the number of expected clusters
within the dataset. This number can also be estimated through statistical means.
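The idea of grouping unlabelled pores by similarity can be illustrated with a minimal sketch. It uses k-means purely as a simple stand-in for clustering in general; the thesis itself applies a Gaussian Mixture Model (Chapter 5). The two pore descriptors, their values and the function name `kmeans` are hypothetical and are not taken from the study data.

```python
import math

def kmeans(points, k, n_iter=50):
    """Group points into k clusters with Lloyd's k-means algorithm."""
    # Deterministic initialisation: use the first k points as starting centres.
    centres = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins the cluster with the nearest centre.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centres[c]))
            for p in points
        ]
        # Update step: move each centre to the mean of its assigned points.
        new_centres = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centres.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centres.append(centres[c])  # keep the centre of an empty cluster
        if new_centres == centres:
            break  # converged: the centres no longer move
        centres = new_centres
    return labels, centres

# Hypothetical, pre-scaled pore descriptors: (relative size, circularity).
pores = [
    (0.05, 0.95), (0.90, 0.30), (0.45, 0.75),
    (0.08, 0.92), (0.85, 0.25), (0.50, 0.70),
    (0.06, 0.97), (0.95, 0.35), (0.40, 0.80),
]
labels, centres = kmeans(pores, k=3)
```

The only user input is `k`, the expected number of clusters, which mirrors the description above. No pore type labels are supplied, so the meaning of each cluster has to be interpreted afterwards.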
In the case of the supervised machine learning algorithms, the initial step is the
generation of a training dataset consisting of a manually labelled subset of the
optical microscopy ImageJ geometric porosity data. The use of supervised
classification facilitates some user input into the performance of the algorithms. The
manually labelled subset enables the algorithms to ‘learn’ what the different
forms of porosity look like geometrically, and the labels enable the accuracy of the
models, among other metrics, to be assessed. Eight
different classification algorithms are assessed, and their performance compared. The
models used are K-Nearest Neighbour, Decision Tree Classification, Naïve
Bayes, Support Vector Machine, Multi-Layer Perceptron (Neural Network), Logistic
Regression, Gradient Boosted Classifier and Extreme Gradient Boosted Classifier.
These supervised approaches were selected as they are some of the most commonly
utilised supervised classification algorithms.
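As an illustration of how a manually labelled subset allows a classifier to be trained and its accuracy measured, the following is a minimal sketch of K-Nearest Neighbour classification, one of the eight algorithms listed above. The descriptor values, the labels assigned to them and the function names `knn_predict` and `accuracy` are hypothetical and serve only to show the principle; they are not the thesis data or implementation.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Predict a label for `query` by majority vote of its k nearest labelled items."""
    # Sort the labelled items by Euclidean distance to the query point.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def accuracy(train, test, k=3):
    """Fraction of test items whose predicted label matches the manual label."""
    correct = sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(test)

# Hypothetical, pre-scaled descriptors (relative size, circularity) with manual labels.
training_set = [
    ((0.05, 0.95), "Gas"), ((0.08, 0.92), "Gas"), ((0.06, 0.97), "Gas"),
    ((0.45, 0.75), "Keyhole"), ((0.50, 0.70), "Keyhole"), ((0.40, 0.80), "Keyhole"),
    ((0.90, 0.30), "LOF"), ((0.85, 0.25), "LOF"), ((0.95, 0.35), "LOF"),
]
test_set = [
    ((0.07, 0.94), "Gas"), ((0.48, 0.72), "Keyhole"), ((0.88, 0.28), "LOF"),
]

predicted = knn_predict(training_set, (0.47, 0.74))
score = accuracy(training_set, test_set)
```

Because the test items carry their own manual labels, the predictions can be scored directly, which is not possible in the unsupervised case.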
1.4: THESIS FORMAT:
This thesis is presented in seven chapters, covering: the introduction; the literature
review; experimental methods and materials; a comparison of CT scanning and
cross-sectional microscopy for porosity evaluation; cluster number evaluation and
unsupervised machine learning; supervised machine learning and classification
evaluation; and conclusions.
Firstly, Chapter 1 introduces this MSc thesis, outlining the research questions and
motivations behind this thesis, as well as a general overview of the work undertaken
as part of this project.
Chapter 2 presents the literature review carried out as part of this study. The aim of this
review is to assess the current published literature and identify gaps within the
literature relevant to this project. This literature review is presented in two separate
sections. Firstly, a background on AM is presented, specifically laser powder bed
fusion, and the effect of process parameters, including gas flow, on porosity is
discussed. The effect of porosity on mechanical performance is also presented.
Secondly, a section on Machine Learning in AM is presented, with examples described
of previous applications of both Supervised and Unsupervised Machine Learning
approaches. A discussion on the many applications and uses of Machine Learning
within Additive Manufacturing is also presented.
A discussion on the Experimental Methods and Materials is presented in Chapter 3.
This chapter provides background on the LPBF printer used in this study, the Renishaw
RenAM 500M, as well as its associated in-process optical emission monitoring system
known as InfiniAM Spectral. Information on the alloy powder used, XCT Scanning
evaluation, cross-sectional evaluation and ImageJ image analysis is also included in
this chapter.
Chapter 4 discusses the comparison between the use of CT scanning and cross-
sectional microscopy for the evaluation of alloy porosity. The results of the porosity
measurements obtained using these techniques are presented and discussed.
Chapter 5 introduces unsupervised machine learning and discusses the various forms
of clustering investigated, in particular the Gaussian Mixture Model. The various
statistical methods used to determine the optimum cluster number are first described,
followed by the results of their implementation. Finally, the application of the Gaussian
Mixture Model to both XCT data, as well as micrograph data at two magnifications is
presented.
Chapter 6 provides an overview of the supervised machine learning approach,
followed by a discussion of the various analysis scores used in assessing a classifier’s
performance. Two datasets are used in this study, both of which were obtained from
the micrograph analysis of Ti-6Al-4V. One set is referred to as the ‘Training Set’, while
the other dataset is known as the ‘Test Set’. A brief discussion of the two datasets is
presented. Finally, the performance of the eight classifiers is
evaluated when applied to the two datasets.
Chapter 7 summarises the conclusions from this thesis. It includes a critical
assessment of the application of both unsupervised clustering techniques and
supervised classification techniques to the porosity data. This chapter
also contains a section on possible future work.
Appendix A depicts the BIC Score graphs that were used in Chapter 5 in an attempt to
determine the cluster number in each of the micrograph and XCT datasets. As each
micrograph dataset and the XCT dataset contained three gas flow rates, the BIC score
graphs are shown in a three-by-three grid.
CHAPTER 2: LITERATURE REVIEW:
2.1 OVERVIEW
This review firstly provides an overview of the importance of Additive Manufacturing,
with a particular focus on the Laser Powder Bed Fusion (LPBF) process, which is the
focus of this thesis. The effect of print defects, in particular porosity, is discussed,
highlighting how different types of porosity can impact differently on the performance
of the printed parts. The use of Machine Learning (ML) is then introduced and an
overview of this data analysis technique in process understanding is outlined, with a
particular emphasis on the use of unsupervised clustering techniques, as well as
supervised classification techniques. The application of ML clustering to porosity data
is then outlined, given its potential to categorise the type and level of porosity present.
2.2 Additive Manufacturing
Additive Manufacturing (AM) involves the printing of 3D computer models,
layer by layer, until the final full 3-dimensional object has been manufactured [1]. These
3D computer models are obtained as Standard Triangle Language (.stl) files from
CAD software. The models may be constructed directly in CAD or imported into CAD
from tomographic scanning methods such as Computed Tomography (CT) [2] or
Magnetic Resonance Imaging (MRI) [3]. The number of applications of AM is growing
rapidly as this printing technology becomes more mature.
There is a wide range of applications of AM in industry, in particular for geometrically
complex parts and individualised components, for which conventional manufacturing
approaches are both technically difficult and often very expensive. The medical
device sector is one example, where the ability to produce rapid prototypes, high quality
bone implants, and accurate models of patients’ damaged bones for analysis is
unique to AM [4]. AM printing now allows doctors to scan and build a model
of a patient’s defective bone structure, which can provide them with a better idea of what
to expect before surgery and allow them to plan a more robust procedure, saving time and
cost, as well as potentially improving the result of a required surgery [4]. Bone implants
are also an area where AM has far exceeded the capabilities of other manufacturing methods.
AM methods allow for implants virtually identical in shape to the original bone, and due
to the ability of AM to produce incredibly precise small-scale geometries,
manufacturers can create lattice structures with controlled porosity in order to permit
osteoconductive bone ingrowth, an ability unique to AM [5]. The mechanical strength
obtained for the implanted devices has been reported to be up to three to five times
higher than that of implants produced through other manufacturing methods, while the
possibility of inflammation post-surgery caused by micro debris is reduced [6]. AM is
also a very beneficial technology for dentists, as it allows them to build a model of their
patients’ mouths, as well as to print dental implants [7].
AM also facilitates the production of lightweight components, which find application,
for example, in both the automotive and aerospace industries. For both industries, the
ideal component is as lightweight as possible, exhibits similar or better mechanical
performance to conventional parts and of course does not compromise the safety of
the part [8]. Complex cross-sectional geometries such as the honeycomb cell are
feasible with AM technology, as well as a variety of other geometries that contain
cavities [8]. Part consolidation is also achieved with AM, enabling the production of
several components at once and reducing the amount of tooling and inspection required [9].
Figure 1: GE Aviation fuel injector nozzle for Superalloys. Metal AM facilitates the
production of more lightweight components in a single print process [10]
One of the advantages of AM is that it facilitates a considerable reduction in process
waste compared to traditional manufacturing methods [12]. Only material needed for
the component being printed is added to each layer (excluding supports where
necessary), compared to a process such as milling, where material is removed from a
piece larger than the desired product until the desired product has been obtained.
Figure 1 depicts a fuel injector nozzle used in the LEAP engine [10], for which the use of
AM facilitated the production of the component in a single print process, thus avoiding
the need to bond parts as well as achieving a significantly lighter final component.
AM encompasses a wide range of individual processes, each varying with
respect to machine technology and materials processed. Consequently, in 2012 the
American Society for Testing and Materials (ASTM) group "ASTM F42 Additive
Manufacturing" defined a set of standards known as Designation: F2792-12a [63] that
classify a range of AM technologies into 7 distinct categories. Of these, 2 are
predominantly used in the manufacturing of metal components, namely: Powder Bed
Fusion [13][14] and Directed Energy Deposition [15][16]. In this thesis the focus is on
one of the powder bed fusion technologies, Laser Powder Bed Fusion, which is
therefore described in detail in the following section.
2.2.1 LASER POWDER BED FUSION
Figure 2: Cross Section Schematic of a Laser Powder Bed Fusion process at machine
scale (left) and powder scale (right) [45]
Laser Powder Bed Fusion (LPBF) is a form of selective laser melting, where a high-power
laser is used to melt selected regions of metallic powder together in a layer-by-layer
process [45]. The goal of LPBF is to completely melt the feedstock powder in
order to obtain fully dense parts.
Figure 2 shows a cross section of an LPBF printer, consisting of a powder delivery
system, and an energy delivery system [45]. The powder delivery system is composed
of a powder delivery piston, which when raised supplies the build powder feedstock
material. The spreader then spreads the feedstock evenly across the build plate, which
is located over a fabrication piston. This piston lowers each time a layer is completed.
The energy delivery system is composed of either a laser (used in Selective Laser
Sintering (SLS)) or an electron beam (used in Electron Beam Melting (EBM)), and a
scanner system composed of optical mirrors capable of focusing the beam to any point
on the build plate. Many LPBF systems come equipped with in-process monitoring
capabilities in order to facilitate the monitoring of parts in real time, with the aim of
reducing the number of rejected parts, saving costs and improving reproducibility. In
addition to in-process monitoring systems, closed loop
feedback systems, which automatically moderate process parameters in order to
improve part quality, are of increasing interest in the LPBF industry [18][19].
A variety of in-process monitoring systems have been developed for LPBF systems,
including image processing methods, thermal imaging, and melt pool spectroscopy
[20]. Amongst these, optical monitoring systems such as charge-coupled device
(CCD) and complementary metal oxide semiconductor (CMOS) camera-based
systems are common in industry. Advantages of this form of monitoring include the
absence of contact with the powder bed, its versatility and the large amount of data that
can be obtained [19]. Lu et al. designed an optical process monitoring system for an
LPBF process, in which an optical camera, a set of diode lights and a mirror were used
[21]. The density and tensile strength of the samples were then correlated with
‘features’ obtained from the camera images. This system then allowed the mechanical
performance of the samples to be estimated based upon the number of features
identified in the images.
An increasingly popular form of process monitoring involves the monitoring of melt
pool emissions and of the laser. Renishaw PLC have developed an in-process
monitoring system known as InfiniAM, which monitors both the laser and the melt pool
across a wide spectral range [22][23]. The LaserVIEW system enables the intensity of
the laser input to be monitored at a rate of up to 2 MHz, which enables any drift or
deviation in behaviour of the laser to be identified. This, along with initial
system calibration information, enables the laser to be monitored over long periods of
time. The InfiniAM MeltVIEW module uses a dual coaxial diode system in order to
monitor emissions produced by the melt pool. A plasma diode records emissions in
the range 300 to 700 nm, while the melt view diode monitors IR emissions in the range
700 to 1700 nm. The Renishaw proprietary software InfiniAM Spectral then converts
this data into 2D and 3D representations as visual feedback for the user. There are
currently limited publications on the use of this process monitoring system, although
several studies by Egan et al. have demonstrated the technology’s ability to detect
larger porosity [39].
2.2.2 LPBF PROCESS PARAMETERS
Figure 3: Schematic depicting the laser beam and powder bed interaction [53]
LPBF involves numerous process parameters, which can influence the properties of
the printed part. Some of these parameters, depicted in Figure 3, relate to the
beam, the powder, and the scanning parameters [53]. The properties of the
powders such as morphology, particle size, shape, roughness and chemistry also
affect the performance of the printed part [23]. The scan strategy (describing the
motion of the laser in producing the part) also plays an important role in part property
performance: scan strategies such as parallel scanning, spiral scanning,
paintbrush scanning and chessboard scanning have all been demonstrated to impact
on mechanical properties [26][27].
A further factor influencing the LPBF process is the use of a controlled atmosphere in
the printing chamber using an inert gas, such as Nitrogen or Argon. This is in order to
maintain Oxygen levels below 500 parts per million [60]. The aim is to prevent the
metal powder from becoming oxidised or degrading during the printing process.
The parameters highlighted in yellow in Figure 4 indicate the process parameters that
have been reported to have the most significant impact on the generation of defects
during LPBF processing [61]. The highlighted process parameters can be related via
the Volumetric Energy Density (VED) equation [62]. VED expresses the amount of
energy imparted by the beam source into the powder bed, per unit volume. Assuming
100% of beam energy is used in melting the powder, this equation gives the amount
of energy used in melting 1 cubic millimetre of powder, as a function of laser power,
hatch distance, layer thickness and scan speed. The Volumetric Energy Density
equation is given in equation 1 [62]. The main benefit of using this equation is that it
can provide an indication of the process window for the LPBF process, where porosity
can be minimised, as seen in Figure 5 [63].
Figure 4: Process Parameters in LPBF (also known as SLM) with VED parameters
highlighted in yellow [63]
$$\mathrm{VED} = \frac{P}{v \cdot h \cdot \delta}$$ Equ. 1
Where:
v - Scan Speed The velocity at which the beam moves in each direction. Given
in units of 
P - Laser Power SLM requires a high-power laser in order to fuse metallic
powders. Power typically ranges from 50W to 250W. Example lasers include
CO2 Lasers, Nd:YAG, Yb:YAG.
h - Hatch Spacing The distance between adjacent beam track centres.
δ - Layer Thickness The thickness of each layer of powder. This value
is generally quite thin at 20 - 50 µm [65]. This allows for high surface precision;
however, this comes at the expense of a slow build rate.
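As a worked illustration of the relationship described above, the following minimal sketch evaluates the commonly used form VED = P / (v · h · δ), which gives J/mm³ when P is in W, v in mm/s, and h and δ in mm. The function name and the numerical values are illustrative assumptions and are not the settings used in this study.

```python
def volumetric_energy_density(power_w, scan_speed_mm_s, hatch_mm, layer_mm):
    """Return the volumetric energy density in J/mm^3 as P / (v * h * delta)."""
    if min(scan_speed_mm_s, hatch_mm, layer_mm) <= 0:
        raise ValueError(
            "scan speed, hatch spacing and layer thickness must be positive"
        )
    return power_w / (scan_speed_mm_s * hatch_mm * layer_mm)

# Illustrative values only: 200 W laser, 1000 mm/s scan speed,
# 0.1 mm hatch spacing and a 40 micrometre (0.04 mm) layer.
ved = volumetric_energy_density(200.0, 1000.0, 0.1, 0.04)

# Doubling the scan speed halves the energy delivered per unit volume.
ved_fast = volumetric_energy_density(200.0, 2000.0, 0.1, 0.04)
```

With these assumed inputs the result is approximately 50 J/mm³, and doubling the scan speed reduces it to approximately 25 J/mm³. The calculation assumes, as stated above, that all of the beam energy is used in melting the powder.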
The VED equation generally exhibits two regimes, a high volumetric energy density
regime and a low volumetric energy density regime, both of which have been reported
to induce different types of porosity defects within AM alloy components [63]. These
defects are two different forms of porosity, known as Keyhole (KH) porosity in the high
VED regime, and Lack of Fusion (LOF) porosity in the low VED regime. These forms
of porosity are shown in the micrograph images inserted within the Porosity VED plot
in Figure 5:
Figure 5: Differing forms of Porosity formed at the VED values shown in LPBF
produced components. The insert images help to demonstrate the types of porosity
generated i.e. LOF (low VED) and KH (high VED). Minimised porosity in the
highlighted light blue region achieved through minimising both KH pores, and LOF
pores [63].
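The two regimes can be expressed as a simple decision rule. The sketch below is illustrative only: the function name is hypothetical and the window bounds are placeholder values supplied by the caller, not the limits of the process window shown in Figure 5.

```python
def porosity_regime(ved, window_low, window_high):
    """Name the dominant porosity type expected for a given VED value."""
    if window_low > window_high:
        raise ValueError("window_low must not exceed window_high")
    if ved < window_low:
        return "Lack of Fusion (LOF)"  # too little energy: incomplete melting
    if ved > window_high:
        return "Keyhole (KH)"          # too much energy: vapour depression forms
    return "Process window"            # both LOF and KH porosity are minimised

# Hypothetical window bounds, chosen only to exercise the three branches.
regimes = [
    porosity_regime(v, window_low=40.0, window_high=80.0)
    for v in (25.0, 60.0, 120.0)
]
```

In practice the bounds of the window depend on the alloy and the machine, and have to be determined experimentally.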
2.3 DEFECTS (INCLUDING POROSITY)
The microstructure, as well as the mechanical performance, of a component
produced through AM can be affected by defects arising from the process parameters
used during manufacturing [34]. Additional parameters such as part orientation, support
structures, part design and material choice can also impact the quality of the
produced component [35]. Examples of defects include porosity, balling, hot
tearing, surface roughness, residual stress, and distortion due to the rapid solidification
of the metal powder in AM [36].
Residual stress can form due to high temperature differences between the molten
metal and cooler substrate [37]. This can lead to severe warping of the part, and in
some cases cracking. In order to avoid this the build plate is often heated, along with
the implementation of certain scan strategies such as spiral [38].
Balling is a defect whereby powdered material forms spheres wider than the part’s
layer thickness. This defect can occur due to a high oxygen content in the build chamber
(>0.1%). It can also form due to reduced scan speed with insufficient power,
and can be rectified through partial remelting of the powder [37][38].
Hot tearing is a defect that forms during the solidification of the melt pool. As this melt
pool solidifies, dendritic grains grow inward from the edge of the melt, and
consequently the centre of the molten material is the final region to solidify. As
opposing grains grow inward, a strain is placed on the molten metal and at the point
of solidification, this can result in cracks forming, known as hot tearing [39].
The average surface roughness of printed parts has been shown to be proportional to
the average feedstock powder particle diameter [40]. High levels of printed part
roughness can occur due to incorrect process parameters, or poor part design.
Insufficient energy being imparted into the powder also produces enhanced roughness
due to powder particles sticking to the surface of the part.
The focus of this research study is on the porosity generated in printed alloy parts and
as a result the following section provides details as to how it is generated, along with
the types of porosity. Porosity can be defined as the absence of material either within
a layer, between layers, or on the external surface of a part [41]. Porosity is generally
classified into three different classes [41]:
• Microstructural porosity, where porosity is formed unintentionally, often due to
incorrect parameter choices leading to a substantial increase/decrease in VED.
• Structural porosity, which is intentional and intended to promote bone
ingrowth in orthopaedic components in the biomedical sector. This type of
porosity is often implemented through the creation of lattice structures in the
part.
• Functional porosity, which occurs due to de-binding between layers, and often
results in large, connected porosity.
Functional porosity can be minimised through the Hot Isostatic Pressing (HIP)
treatment of parts after printing [42]. In contrast, structural porosity is intentionally
produced. However, microstructural porosity must be minimised where possible as it
can significantly impact on a part’s mechanical performance [43]. Throughout the rest
of this study, microstructural porosity will be referred to as porosity. This porosity can
be further divided into three subclasses, namely [44]:
• Lack of Fusion (LOF) pores
• Keyhole (KH) pores
• Gas (G) pores
Figure 6: Microstructural pore type morphologies, generated with common (upward)
build direction. Lack of Fusion (a) Gas (b) and Keyhole (c) [44]
Each of the three classes has a unique formation mechanism that can be related to the
VED equation for AM processes, described earlier. The porosity regimes for all 3
microstructural pore types are plotted in Figure 8. Generally Keyhole and Lack of
Fusion porosity contribute to a significantly larger volume of porosity in comparison to
Gas porosity, resulting in an ideal process parameter ‘window’ existing between the
two regimes. This is shown in Figure 5, where porosity is minimised between 3.2 and
4.3  for LPBF produced metallic components. [41]
Lack of Fusion porosity occurs when the VED is too low, possibly due to insufficient
Laser Power or a Scan Speed that is too fast. This causes insufficient fusion of the
metal powder, as less energy per cubic millimetre is being imparted into the powder
[45]. It results in irregular pore geometries of varying sizes, with sharper edges. LOF
porosity often contains un-melted powder particles within the pore itself, as seen in
Figure 7, a pore from this study.
Figure 7: Microscopy image of LOF porosity, with characteristic un-melted powder
present within the pore, as well as irregular shape and relatively large size
Keyhole mode porosity, however, occurs when the VED is too large, because of
increased Laser Power or reduced Scan Speed. This creates a vapour
depression within the melt pool, with high liquid flow velocity, that closes in on itself as
the laser propagates forward [46]. This results in large, rounded, but not perfectly
spherical pores, with a characteristic ‘J’ shape.
In the intermediary VED range (blue region highlighted in Figure 5) both Lack of
Fusion and Keyhole type porosities are minimised and, as a result, so is the total
porosity [63]. However, minimal amounts of small scale, spherical porosity still occur;
this is Gas type porosity, often known as metallurgical porosity [47]. This has been
attributed to the entrapment of shielding gas such as Argon or Nitrogen, porosity
of powder particles, or alloy vapours within the melt pool.
2.3.1 MECHANICAL PERFORMANCE IMPACT OF POROSITY
The impact of porosity on the mechanical performance of LPBF components is
considerable, and there are a number of factors that need to be considered when
discussing this impact. The first factor to consider is the type of porosity, as it has
been demonstrated that keyhole and lack of fusion porosity have a significantly larger
impact on mechanical performance compared to gas type porosity [49]. Both keyhole
pores and, especially, lack of fusion pores act as stress concentrators due to their
irregular geometries. They can also act as initiation points for crack propagation within
the component [49]. The impact of Lack of Fusion porosity is so severe that processing
conditions for most alloys have been experimentally determined in order to avoid
these pores [50][51][52]. The location of the pore within the component also influences
the specific mechanical property it will impact. It has been shown that pores located at
the surface, sub-surface, interior bulk material, interlayer, and at the matrix interface will
affect corrosion, fatigue strength, stiffness, mechanical strength, and fracture
toughness of the component in particular [53].
Figure 8: Porosity Regimes for both Keyhole and Lack of Fusion porosity in metal AM
components with respect to increasing scan speed [48]
In LPBF components, porosity located on the surface, sub-surface and within the bulk
of the material has a considerable impact on the strength and stiffness of the part.
Increasing the level of porosity in LPBF printed alloys reduces their yield strength,
ultimate strength, and Young’s modulus under both tension and loading conditions
[53]. Du Plessis et al. concluded through a combination of computational simulation
and experimental verification that increased levels of porosity in Ti-6Al-4V
components resulted in reduced yield strength and ductility [54]. Figure 9 shows an
example of a stress simulation from the research of Du Plessis et al. Regions of increased
irregularity, as well as pore edges, exhibit the most stress, and it has been established
in the literature that regions such as these usually initiate failure within components
[55]. Phutela et al. observed that LPBF produced Ti-6Al-4V samples with an increased
level of porosity exhibited reduced tensile strength, relative to identical samples with
low levels of porosity [55]. Al-Maharma et al. demonstrated that more irregular pores,
such as the Lack of Fusion pore depicted in Figure 9, had a significantly more
detrimental impact on a component’s mechanical properties than
pores with a regular geometry, such as gas and keyhole type porosity
[56]. It has been demonstrated in the literature that increasing levels of porosity in
metal alloys lead to a reduction in a series of different
mechanical properties, such as tensile strength, Young’s modulus and Poisson’s ratio
[57][58][59]. Therefore, it is paramount in AM processes to minimise the
microstructural porosity produced.
2.3.2 IMPACT OF LPBF GAS FLOW ON POROSITY
In LPBF processes, printing occurs under an inert atmosphere [60]. The term ‘inert’
refers to chemically inert or chemically inactive gases. These are unreactive gases,
often noble gases or Nitrogen gas, and they are important in the LPBF process as
purging gases, as they remove or ‘purge’ possibly reactive gases from the environment
of the printer that may otherwise contaminate the
process through oxidation of the molten powder [61]. This inert gas flow is crucial to
the process, as contamination may alter the chemical or
physical properties of the printed component, possibly impacting its mechanical
properties or biocompatibility [62]. It has been established within the
literature that gas flow plays a vital role in the LPBF process [60]. In Electron Beam
Melting (EBM) processes, in contrast to general LPBF processes, a vacuum is
required in order to avoid interactions between the beam electrons and gaseous
molecules [63].
Figure 9: Computational ANSYS stress simulation showing the Lack of Fusion pore
acting as a stress concentrator [26].
Figure 10: Schematic of Gas Flow in general LPBF process chamber, depicting gas
flow powder bed interactions [65]
While the inert gas flow prevents oxidation of the molten powder, the flow of gas also
plays an important secondary role in the removal of process emissions, such as
spatter from the melt pool, helping to create a clear path for the laser beam
to the powder bed, unobstructed by process by-products. Figure 10 shows a
general schematic of this process. It is important that the inert gas flow is homogeneous,
as it has been demonstrated by Chen et al. that turbulence within the inert gas flow
can lead to undesired surface roughness within the produced component [32].
An aim of this current research is to assess the impact that purging gas flow has on
both the type and level of porosity in Ti-6Al-4V alloy components. Although it has been
well established in the literature that inert purging gas plays a vital role in the quality
of the produced component [62][63][64], a direct study of the impact of varying gas
flow on the overall level of porosity in printed alloy components, or of the
relationship between inert gas flow and pore morphology, has not previously been
reported.
2.4 MACHINE LEARNING IN ADDITIVE MANUFACTURING
The growing trend of automation with the emergence of ‘big data’ in industry is a
fundamental aspect of Industry 4.0. This term is used to help describe the current trend
of digitisation of manufacturing processes and has been defined as ‘automation and
data exchange in manufacturing technologies, including cyber-physical systems, the
Internet of Things, cloud computing and cognitive computing, as well as the creation
of the smart factory’ [66].
Machine Learning (ML), which is a subset of Artificial Intelligence (AI), has become
an increasingly important approach for the handling of large volumes of process data
within Industry 4.0. ML is defined as ‘Computer Programming designed in order to
optimise performance criterion based on experience’ [33].
ML can be divided into three main categories, namely supervised learning,
unsupervised learning, and reinforcement learning [34]. In supervised learning, each
item consists of an input and an output. The output ‘Y’ can be either a continuous or
categorical variable, whereas the input ‘X’ is a vector containing all
involved features in the form $(x_1, x_2, x_3, x_4, \ldots, x_N)$, where N is the number of
features within the dataset [33]. Depending on whether the output ‘Y’ is a continuous
or categorical variable, two different forms of supervised learning can be used. In the
case of a categorical variable, Classification methods can be used, such as indicating
whether a part is porous or non-porous. If the output ‘Y’ is a continuous variable,
Regression models can be used. However, sometimes input data does not have
an output label, and in this case unsupervised ML models are implemented. These
models instead infer meaning from the unlabelled data structure. Typical applications
of unsupervised learning involve Clustering methods, where input items are grouped
together based upon their similarity [34]. Another application of unsupervised methods
is anomaly detection, where anomalous input items are detected, relative to the input
dataset. Reinforcement learning involves the model in
question interacting with its environment, and ‘learning’ to take the best course of
actions in order to yield the greatest reward [35]. This approach does not require a
training set and learns from its own actions. Reinforcement learning is the type of ML
implemented in self-driving cars [36].
AM technologies produce vast amounts of build data relative to other manufacturing
technologies. Through the application of machine learning tools, this data has the potential
to optimise the AM process at nearly every stage in the workflow [70]. A taxonomy of
ML applications within the AM industry is presented in Figure 11.
The application of ML to AM has ranged from optimising process parameters to
anomaly detection based on the analysis of in-situ process monitoring data [33]. A
major issue for the wider adoption of AM is the difficulty in achieving consistent part
quality, due to process defects such as porosity. Part quality is highly dependent on
process parameters, such as those in the VED equation (Equation 1, Section 2.2.2). The
relationship between these process Parameters, the produced Structure, and
consequently the Properties (PSP) has been discussed and reviewed in several
publications [37][38]. One method to ensure part quality and consistency is the
application of in-process monitoring systems; however, these systems require an
efficient defect detection method. Egan et al. implemented a statistical anomaly
detection algorithm known as the Generalised Extreme Studentized Deviate (GESD)
test, which was successfully applied post-process to in-situ process monitoring data to
detect a wiper tear within the produced lattice structure [39]. Another method to
understand these PSP relations is to conduct a series of experiments in which
the process parameters are varied in order to assess their impact upon component
properties. This can, however, prove both costly and time consuming.
2.4.1 ML APPLICATIONS IN AM
Figure 11: Taxonomy of ML applications within the AM industry [33].
Some current applications of Machine Learning within the Additive Manufacturing
industry are shown in Figure 11. The multitude of possible data sources lends itself
to a variety of possible ML applications. Some combinations of these data
sources have been studied extensively; these are written in bold in Figure 12 and
are discussed further in the following sections.
Figure 12: Process Structure Property (PSP) relation chain schematic in AM [33]. Text
within boxes represents available data, while arrows and bold text represent examples
of the input and output data sources of potential ML applications.
2.4.1.1 DEFECT DETECTION, CLOSED-LOOP CONTROL AND
QUALITY PREDICTION
Recent advancements in the use of in-situ process monitoring systems have enabled
many different AM technologies to capture real-time data that should enable defect
detection [42]. There are several ways in which ML models can be applied to this
data, which can include images from a CCD camera source and thermal emission
data, as well as post-process XCT scans of the components. One possible application
of ML to this data is unsupervised clustering, whereby in-process
monitoring data is grouped into clusters that display regular and irregular
behaviour, acting as a pseudo-anomaly detection algorithm in order to detect defects
[42]. A more time-consuming ML approach to the investigation of in-process data involves
manually labelling the data, for example using XCT scan data, as either Defect or Non-Defect.
This labelling can be obtained through experimental results, or through knowledge of the
defects present within the in-process data. The labelled data can then be used as an
input to a supervised classification algorithm, which classes every instance
of the data as defect or non-defect. A third method involves real-time training of a
supervised regression model, where a random variation in one of the input features
is compensated for by a variation of the output label ‘Y’. Wang et al. implemented
one of these real-time learning regression models in a Material Jetting (MJ) process
[43]. Image analysis of CCD camera images was
conducted in order to extract four droplet features, namely length, volume, speed and
satellite. The drive voltage used to stabilise the droplet was the continuous target label,
and through real-time learning implemented in a convolutional neural network (CNN),
stochastic variations in input features were compensated for by varying the drive
voltage value. It was shown that the process achieved increased stabilisation in droplet
behaviour.
2.4.1.2 GEOMETRIC ACCURACY ANALYSIS AND CONTROL
Common manifestations of defects in AM printed parts include part distortion,
porosity, reduced surface quality, and geometric inaccuracies [43].
These may impinge upon the usability of AM in precision engineering applications,
such as in the aerospace and orthodontics sectors, where extremely tight part
tolerances are required [44]. To this end, ML algorithms have been implemented in
order to predict geometric inaccuracies, quantify the geometric deviation and
subsequently compensate for it during the design of the part being
printed. Francis et al. implemented this framework through a Convolutional Neural
Network model, applied to the LPBF process, which is indicated schematically in
Figure 13 [44].
Figure 13: Schematic outlining input parameters for CNN model applied to LPBF
production of Ti-6Al-4V. Input parameters include thermal history from in-situ PM data
(a) as well as process parameters indicated in (b). The predicted distortion output
is shown in (c). CAMP-BD represents Convolutional and Artificial CNN applied to Big
Data [44]
The thermal history obtained through in-process monitoring data, in combination with
the process parameters of the system, was used as the input, and the distortion of the
part, obtained through XCT scanning, was used as a continuous label for the CNN
model. The CNN model was then trained in order to predict distortion, which was then
reverse imported into the CAD model in order to compensate for the predicted error.
By this method, the accuracy of prints was significantly improved.
2.4.2 UNSUPERVISED CLUSTERING
In unsupervised ML, the data is unlabelled, such that each row contains a vector
of the form $(x_1, x_2, x_3, \ldots, x_N)$ with no output or ‘label’ [45]. As such, the main
purpose of many unsupervised algorithms is to uncover hidden patterns within the data,
or to group similar data together. This grouping, known as Clustering, is one of the
most common types of unsupervised algorithm. In Clustering analysis, data
is separated into a number of groups based upon similarity, such that
objects in the same group are more like each other than they are to objects in other groups.
Clustering can generally be classified into two different types, either ‘exclusive’ (hard) or
‘overlapping’ (soft) clustering [45]. In exclusive clustering, datapoints can only belong to one
cluster, whereas in overlapping clustering they are assigned based upon the likelihood of them
belonging to a certain cluster. A schematic showing both a scenario of exclusive
clustering, and of overlapping clustering is presented in Figure 14.
Figure 14: Exclusive (left) and Overlapping Clustering (right) example. In exclusive
clustering, datapoints belong to one cluster only. In overlapping clustering, points may
belong to two unique clusters. In probabilistic clustering, each datapoint is assigned a
probability of belonging to a cluster, and then assigned to the cluster of highest
probability [49]
A plethora of clustering algorithms exist, and these algorithms differ from each other
in how they cluster data points. As well as being
‘exclusive’ or ‘overlapping’, clustering models can be classified into connectivity models,
centroid models, and distribution models [46][47][48]. Each of these types of model
has its own advantages and disadvantages.
The clustering sub-group Connectivity models are based upon the principle that
datapoints closer to each other are more like each other than datapoints further away
from each other. An example of Connectivity clustering is the Hierarchical
Clustering method, which uses distance-based metrics in order to calculate similarity
within the data [50]. Distance here refers to how close each datapoint lies to another
within a unitised feature space. These distance-based metrics include Euclidean,
Manhattan, and Mahalanobis distance, which differ in how they
measure distance mathematically [50]. These models either follow a bottom-up
approach, where each datapoint is initially an independent cluster and clusters are merged
using a distance measure, or a top-down approach, where all datapoints initially
belong to one cluster, which is then split recursively through some distance measure.
The principal advantage of Hierarchical Clustering is that clustering results are
presented graphically in a ‘dendrogram’. This shows the number of clusters present
within the data at every possible distance [51]. This in turn informs the user of the optimum
number of clusters present within the dataset. The number of vertical lines intersected
by a horizontal line, placed in the interval of the largest distance between successive
cluster merges, is the optimal value for the number of clusters.
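As a minimal sketch of the bottom-up approach and the dendrogram ‘largest gap’ rule described above, the following Python example performs agglomerative clustering on a small set of hypothetical two-dimensional points. The function names, the choice of single linkage and the sample points are illustrative assumptions, and are not taken from the implementation used in this thesis.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two feature vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute coordinate differences between two feature vectors."""
    return sum(abs(p - q) for p, q in zip(a, b))


def agglomerative(points, dist=euclidean):
    """Bottom-up single-linkage clustering.

    Every point starts as its own cluster; the two closest clusters are
    merged repeatedly. Returns the merge distances in merge order, i.e. the
    heights at which the branches of a dendrogram would join.
    """
    clusters = [[i] for i in range(len(points))]
    merge_distances = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(dist(points[a], points[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
        merge_distances.append(d)
    return merge_distances


def optimal_cluster_count(merge_distances):
    """Cut the dendrogram inside the largest gap between successive merges."""
    n_points = len(merge_distances) + 1
    if len(merge_distances) < 2:
        return 1
    gaps = [merge_distances[k + 1] - merge_distances[k]
            for k in range(len(merge_distances) - 1)]
    k = max(range(len(gaps)), key=gaps.__getitem__)
    # After k + 1 merges, n_points - (k + 1) clusters remain.
    return n_points - (k + 1)


# Hypothetical data: three visually separate groups of points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (20, 0), (21, 0)]
heights = agglomerative(points)
print("merge distances:", [round(h, 2) for h in heights])
print("suggested number of clusters:", optimal_cluster_count(heights))
```

Swapping `dist=manhattan` into `agglomerative` changes only how closeness is measured, which is the sense in which the distance metrics above differ from one another.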
The centroid model clustering sub-group, similarly to connectivity models, uses
distance-based metrics; however, these distances are measured between each point
and its respective cluster centroid rather than to other datapoints [53]. An example
of Centroid clustering is ‘K-Means Clustering’, one of the most widely used clustering
methods [53]. The ‘K’ in K-Means stands for the number of clusters within the model
and must be defined before implementing the model. Generally, dendrograms or
‘elbow plots’ can be used to infer the number of clusters present within a dataset.
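The centroid approach can be illustrated with a short sketch of K-Means (Lloyd's algorithm), written here in plain Python. The function name `kmeans`, the random initialisation from the data points and the sample data are assumptions made for illustration only. The returned inertia (sum of squared point-to-centroid distances) is the quantity typically plotted against K in an elbow plot.

```python
import math
import random


def kmeans(points, k, iterations=100, seed=0):
    """Cluster `points` (tuples of floats) into k groups.

    Returns the final centroids and the inertia, i.e. the sum of squared
    distances from each point to its nearest centroid.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct data points
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            groups[nearest].append(p)
        # Update step: move each centroid to the mean of its group.
        new_centroids = [
            tuple(sum(coord) / len(g) for coord in zip(*g)) if g else centroids[c]
            for c, g in enumerate(groups)
        ]
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    inertia = sum(min(math.dist(p, c) ** 2 for c in centroids) for p in points)
    return centroids, inertia


# Hypothetical data: two compact groups of four points each.
data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
for k in range(1, 5):
    _, inertia = kmeans(data, k)
    print(f"K={k}: inertia={inertia:.2f}")
```

For data such as this, the inertia falls sharply between K=1 and K=2 and only slowly afterwards; that bend is the ‘elbow’ used to suggest the number of clusters.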
The third form of Clustering, Distribution model clustering, operates by assessing
certain probability criteria (dependent on the specific distribution clustering model in
question) and assigning points to the cluster with maximum probability [54]. One of the
most popular Distribution models is the Gaussian Mixture Model (GMM), which
uses the ‘normal’, or Gaussian, distribution to model each cluster. The
normal distribution is usually represented by a bell-shaped curve, in which as many
measurements lie above the average (mean) value as below it, and in which the mean,
median and mode are all equivalent [55].
The GMM is a probabilistic model which assumes a certain number of Gaussian
distributions, each of which represents a cluster [55]. While
k-means only considers the mean when determining a cluster centroid, the GMM considers
both the mean and the variance of the data [56]. The GMM will
be applied in this research study in order to investigate the effect of gas flow on pore
morphology in LPBF manufactured Ti-6Al-4V.
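To make the role of the mean and variance concrete, the sketch below fits a one-dimensional GMM using the expectation-maximisation (EM) procedure in plain Python. It is a simplified illustration under stated assumptions: the data are hypothetical single-feature values, the initial means are supplied by hand, and the function names are invented here. It is not the GMM implementation applied later in this thesis.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_gmm_1d(data, initial_means, iterations=50):
    """Fit a 1D Gaussian mixture by expectation-maximisation.

    Returns (weights, means, variances, responsibilities), where
    responsibilities[i][j] is the probability that data[i] belongs to
    component j (a 'soft' cluster assignment).
    """
    k = len(initial_means)
    means = list(initial_means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    resp = []
    for _ in range(iterations):
        # E-step: probability of each component given each datapoint.
        resp = []
        for x in data:
            dens = [weights[j] * gaussian_pdf(x, means[j], variances[j])
                    for j in range(k)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor avoids a collapsed component
    return weights, means, variances, resp


# Hypothetical single-feature measurements drawn from two groups.
values = [1.0, 1.2, 0.8, 1.1, 0.9, 5.0, 5.3, 4.7, 5.1, 4.9]
weights, means, variances, resp = fit_gmm_1d(values, initial_means=[0.0, 6.0])
print("means:", [round(m, 3) for m in means])
print("variances:", [round(v, 3) for v in variances])
# Hard labels are obtained by taking the most probable component.
print("labels:", [r.index(max(r)) for r in resp])
```

Unlike K-Means, each component carries its own variance, so a broad cluster and a tight cluster can sit side by side without the boundary being forced to the midpoint between their means.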
2.4.2.1: UNSUPERVISED CLUSTERING APPLICATIONS IN AM
Several applications of Clustering within AM have been reported in
the literature, as discussed in this section. A Clustering algorithm that has been
frequently used in research is the Self Organising Map (SOM), which is a form of
neural network developed for unsupervised learning. Khanzadeh et al. implemented
an SOM model in their investigation of geometric inaccuracy [57]. A large dataset
containing the geometric inaccuracies produced by various processing parameters was
assessed and clustered. This study concluded that the more clusters a particular set
of processing parameters produced, the more types of geometric deviation it produced
in terms of magnitude and direction.
Another study by Khanzadeh et al. applied SOM in the detection of defects in Ti-6Al-4V
produced through DED [58]. This study involved anomaly detection within melt
pools through unsupervised clustering, and was predicated upon two assumptions:
firstly, that an abnormal melt pool has no correlation with other melt pools; and
secondly, that abnormal melt pools are far outnumbered by
optimal melt pools. The melt pool temperature data was therefore clustered, and
the clusters with little correlation to other clusters were considered anomalous. It
was found that porosity tended to form at these anomalous melt pool locations. Wu et
al. also implemented SOM in the detection of abnormal acoustic emission
(AE) signals from fused filament fabrication (FFF) printed acrylonitrile butadiene
styrene (ABS) [59]. Initial feature extraction of the AE signal waveform was
implemented prior to the SOM Clustering. It was found that the anomalous AE
signals could be successfully detected using SOM, and that they corresponded to
polymer samples that experienced failure mechanisms during printing such as
shrinkage, distortion, or scratching. Figure 15 depicts the procedure implemented by
Wu et al.
Figure 15: Procedure of implementation of Self Organising Map for Acoustic Emission
analysis of Fused Filament Fabrication produced ABS by Wu et al. Feature extraction is
performed on the in-process acoustic emission signal, which is then used as an input to the SOM
algorithm. Subsequently, anomalous signals are detected [59].
Zhao et al. used GMM and k-means techniques for the classification of in-situ thermal
imaging data obtained during the printing of Ti-6Al-4V parts by Directed Energy
Deposition [60]. The clustering models were used to identify defective melt pools in
this additive manufacturing process, and the results were then verified using XCT. It was found
that morphological melt pool features alone were incapable of identifying anomalies,
and that the thermal imaging features could potentially increase the accuracy of anomaly
detection. Due to the GMM's ability to identify a combination of multi-dimensional
Gaussian probability distributions, it has potential for the examination
of XCT porosity data. Snell et al. applied K-Means clustering to XCT data from a variety of
alloys produced through LPBF and found this method of pore evaluation to be
much improved on traditional statistical optimisation methods such as defined class
limits with boundary optimisation [20]. Both methods predicted similar quantities of
Lack of Fusion pores (186 and 183 for K-Means and Boundary Optimisation
respectively); however, a considerable difference in Keyhole pores was reported (373
and 1 for K-Means and Boundary Optimisation respectively). The difference in Keyhole
pores could possibly be due to incorrect initial boundary value selection for the
Boundary Optimisation method. The majority of clustering until now has been applied
as a form of anomaly detection. Particularly common is the
clustering of discretised melt pool emissions, for the purpose of relating in-process
data to the final printed parts [42].
2.4.3 SUPERVISED CLASSIFICATION
Supervised learning is the most widely applied ML technique within AM processing,
as it assists in addressing the problem of predicting a class or variable [33]. The
difference between supervised and unsupervised learning is that in supervised learning the
data is labelled, such that each input vector of N features is of the form
$(x_1, x_2, x_3, \ldots, x_N, Y)$, where Y is the ‘label’ of that vector. The nature of this label determines
the type of ML model that is used: a continuous label implies a regression
model, and a categorical label implies a classification model. In classification
problems, the model is trained on the training set and then uses
this training to classify new inputs.
Classification models make use of the training dataset in order to approximate a
mapping function ‘f’, which takes an unseen input vector and maps it
to a discrete output variable Y. Classification models
can generally be classified as either ‘lazy’ or ‘eager’ learners [61]. Lazy learners store
the training data, and classify new cases based upon the most similar data in the
training set. Lazy learners include models such as K-Nearest Neighbour models. In
comparison with eager learners, lazy learners have reduced training time, with
increased prediction time. Eager learners, in contrast, construct a classification
model based on the training data before receiving test data. Due to this construction,
eager learners exhibit the opposite behaviour to lazy learners, in that they have an
increased training time and reduced prediction time. Eager learners include Support
Vector Machines and Neural Network models [61].
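The ‘lazy’ behaviour described above can be seen in a minimal K-Nearest Neighbour sketch: fitting only stores the data, and all of the distance computation happens at prediction time. The class name, the feature values and the ‘porous’/‘dense’ labels below are hypothetical and chosen purely for illustration.

```python
import math
from collections import Counter


class KNearestNeighbour:
    """A lazy learner: no model is built during training."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        # 'Training' simply stores the labelled examples.
        self.X = list(X)
        self.y = list(y)
        return self

    def predict(self, x):
        # All work happens here: rank stored examples by distance to x,
        # then take a majority vote among the k closest.
        ranked = sorted(zip(self.X, self.y),
                        key=lambda pair: math.dist(pair[0], x))
        votes = Counter(label for _, label in ranked[: self.k])
        return votes.most_common(1)[0][0]


# Hypothetical two-feature training data with categorical labels.
X_train = [(0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.9, 0.8), (0.8, 0.9), (1.0, 1.0)]
y_train = ["dense", "dense", "dense", "porous", "porous", "porous"]

model = KNearestNeighbour(k=3).fit(X_train, y_train)
print(model.predict((0.15, 0.25)))
print(model.predict((0.95, 0.85)))
```

Because every prediction scans the whole stored training set, prediction cost grows with the amount of training data, which is the trade-off against eager learners noted in the text.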
There are four possible types of classification tasks in supervised classification,
namely: Binary Classification; Multiclass Classification; Multilabel Classification; and
Imbalanced Classification [62]. Binary Classification, as the name implies, involves
problems where only two class labels exist. An example binary classification problem
is classifying email as either spam or non-spam. In binary classification
problems, many class labels are in string format, e.g., ‘Cancer/non-Cancer’,
‘Spam/non-Spam’. Therefore, many classification algorithms include automatic label encoders
that convert these classes into the numeric labels 0 (negative) and 1 (positive). In binary
classification, it is common for many models to incorporate a Bernoulli distribution in
their mapping function [62]. This distribution is a discrete probability distribution that
gives the probability ‘p’ of a test vector belonging to class 1 [62]. The 0 class
then has a probability of (1-p). Some models are specifically designed for binary
classification, such as Logistic Regression and Support Vector Machines. A simple
example of binary classification is shown in Figure 16, where the input features are
simply X and Y grid coordinates, facilitating the classification of points into distinct
clusters [63].
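A short sketch of the two ideas above, label encoding and a Bernoulli output, is given below in the style of a logistic regression prediction. The weights, bias and feature values are arbitrary assumptions rather than fitted parameters, and the helper names are invented for illustration.

```python
import math


def encode_labels(labels, positive):
    """Convert string class labels to 1 (positive class) and 0 (negative class)."""
    return [1 if label == positive else 0 for label in labels]


def sigmoid(z):
    """Squash a real-valued score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(features, weights, bias):
    """Bernoulli output: probability p of class 1, and (1 - p) of class 0."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    p = sigmoid(z)
    return {1: p, 0: 1.0 - p}


print(encode_labels(["Spam", "non-Spam", "Spam"], positive="Spam"))

# Hypothetical, hand-picked parameters for a two-feature problem.
probabilities = predict_proba(features=(2.0, 1.0), weights=(0.8, -0.4), bias=-0.5)
print(probabilities)
predicted_class = 1 if probabilities[1] >= 0.5 else 0
print("predicted class:", predicted_class)
```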
Figure 16: An example of Binary Classification, where only two distinct clusters exist
within the dataset [63]
Figure 17: An example of Multiclass Classification, where multiple clusters exist within
the dataset [62]
Multiclass classification, in contrast to binary classification, is where more than two
possible class labels exist [62]. An example of multiclass classification is facial
recognition on a smartphone, or plant species classification, as a variety of possible
faces or plants may need to be identified by the classifier. A common practice is to
model a multiclass classification task with a model that can predict a Multinoulli
probability distribution [62]. This distribution is a discrete probability distribution that
covers cases where an event can have one of K categorical outcomes, such as
$k \in \{1, 2, \ldots, K\}$. In classification tasks the model predicts the probability that a test example belongs
to each of the classes. Many algorithms that can be implemented in binary
classification can also be implemented in multiclass classification. Figure 17 shows a
scenario in which a distinct number of clusters are present, similar to the situation in
Figure 16. However, in this case multiclass classification is required, as there are more
than two clusters.
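One common way of producing a Multinoulli output is the softmax function, which converts one raw score per class into probabilities that sum to one. The sketch below assumes three hypothetical class scores; the class names are taken from the porosity types named in this thesis, but the scores themselves are invented.

```python
import math


def softmax(scores):
    """Map K real-valued class scores to K probabilities that sum to 1."""
    largest = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


classes = ["Gas", "Keyhole", "Lack of Fusion"]
scores = [2.0, 1.0, 0.1]  # hypothetical raw model outputs, one per class
probabilities = softmax(scores)

for name, p in zip(classes, probabilities):
    print(f"{name}: {p:.3f}")
print("predicted class:", classes[probabilities.index(max(probabilities))])
```

With K = 2 this reduces to the Bernoulli case, which is one reason many binary classifiers extend naturally to multiclass problems.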
Multilabel classification is a situation where two or more class labels exist, and each item
may carry one or more labels. This is especially prevalent in image recognition,
where an image may contain labels such as ‘dog’, ‘tree’, and ‘person’. In these
problems, it is common to use models that can predict multiple outputs covering the
permutations of possible label combinations, with each output often following a
Bernoulli probability distribution; in essence, the model makes a series of binary
classification predictions. Specialised models such as Multilabel Random Forest and
Multilabel Decision Trees exist for this situation [62]. Imbalanced Classification refers
to cases in which the number of items in one class far exceeds that in the other class. This
is typically seen in binary classification problems, such as in fraud detection, anomaly
detection, and medical diagnostic testing [62]. In order to compensate for this
imbalance, up-sampling of the minority class or down-sampling of the majority class
is sometimes performed. Specialised algorithms that are
especially sensitive to the minority class also exist, such as Cost-Sensitive Logistic Regression
models or Cost-Sensitive Support Vector Machines. Figure 18 depicts a case where
Imbalanced Classification models would be necessary, due to the number of ‘0’ items
far exceeding the number of ‘1’ items.
Figure 18: A case requiring Imbalanced Classification, as far more cases of the blue
‘0’ class exist, in comparison to the orange ‘1’ class [62]
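The up-sampling strategy mentioned above can be sketched as random over-sampling with replacement, in which minority-class items are duplicated until every class is as large as the majority class. The function name and the toy data are assumptions made for illustration.

```python
import random
from collections import Counter


def upsample_minority(X, y, seed=0):
    """Randomly duplicate items of the smaller classes until all classes
    contain as many items as the largest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    majority_size = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        members = [x for x, lab in zip(X, y) if lab == label]
        # Sampling with replacement; k is 0 for the majority class.
        extra = rng.choices(members, k=majority_size - count)
        X_out.extend(extra)
        y_out.extend([label] * len(extra))
    return X_out, y_out


# Hypothetical imbalanced data: eight '0' items and two '1' items.
X = [(i, i + 1) for i in range(10)]
y = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

X_balanced, y_balanced = upsample_minority(X, y)
print("before:", Counter(y))
print("after: ", Counter(y_balanced))
```

Over-sampling should be applied to the training data only; duplicating items before splitting off a test set would leak copies of test items into training.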
A variety of classification algorithms have already been applied in
AM technologies. The choice of algorithm depends on the type of data that is being
used, as well as the computational power available, as some algorithms are more
computationally intensive than others [96]. Table 1 summarises several classification algorithms,
as well as their use in AM.
Both Gradient Boosting and Extreme Gradient Boosting algorithms have become
increasingly popular in recent times [97]. Extreme Gradient Boosting in particular has
become one of the most frequently used models in machine learning competitions and
is regularly the best performing model in these competitions [97].
| Classification Algorithm | Input | Output | Use |
|---|---|---|---|
| K-Nearest Neighbour | Melt pool characteristics | Porous or non-porous LPBF components | Defect Detection [98] |
| Support Vector Machine | Spectral Intensity Graph | Defect detection in Direct Metal Deposition | Defect Detection [99] |
| Naïve Bayes | Dimensional Variation in LPBF components | Types of infill | Quality Assessment [100] |
| Decision Tree Classifier | Process Parameters | Surface Roughness | Quality Assessment [101] |
| Multi-Layer Perceptron | In-situ images | Defect Detection in LPBF produced Ti-6Al-4V | Defect Detection [102] |
| Logistic Regression | Process Parameters | Process parameter optimisation for fully dense components | Quality Assessment [103] |
| Gradient Boosted Classification | Thermal Imaging data | Melt pool defect detection in DED produced metallic alloys | Defect Detection [104] |
| Extreme Gradient Boosted Classification | Process Parameters | Distortion prediction in metallic alloy components | Process parameter optimisation [105] |
Table 1: Examples of eight classification algorithms used in AM processes and their
specific uses.
2.4.3.1: SUPERVISED CLASSIFICATION APPLICATIONS IN AM
There is a wide range of possible applications for ML in AM technologies, specifically
for supervised classification tasks. Some of the most frequently applied techniques involve Decision
Trees (DT), Support Vector Machines (SVM), and Convolutional Neural Networks
(CNN).
One of the main reasons for the increased use of DT models in AM is their
interpretability in comparison to other ML techniques [63]. Khanzadeh et al. applied
Principal Component Analysis (PCA) to in-situ thermal imaging data of DED produced
Ti-6Al-4V. This data was then labelled as either porous or non-porous, with the labels
obtained through XCT scanning of the components [63]. A series of supervised
classifiers were applied to this data, and DT showed consistently above-average
performance. Although DT are a relatively simple model, they tend to perform well,
and serve as a good comparison model for more complicated classifiers.
The SVM model is one of the most common classifier models in AM [64][65][66] and
is used in binary classification problems. Each input-output combination consists of an
N-dimensional input vector (where N is the number of input features) and its output,
plotted in an N-dimensional feature space. SVM then produces a hyperplane in that
feature space in order to partition the two groups. The choice of kernel in SVM plays
an important part in the overall accuracy of the model, with some complex kernels
yielding increased accuracy at the cost of increased computation time. Zhang et al.
implemented SVM for defect detection in in-situ images, yielding 90.1% accuracy in
DED produced Ti-6Al-4V [65]. The SVM results were validated using XCT scans.
The most popular classification method in AM is the Neural Network (NN) model [63].
Regular NN models such as the Multi-Layer Perceptron (MLP) are sufficient when the
problem consists of parameters and classes, whereas specialised NN models known as
Convolutional Neural Networks (CNN) are used for image-based problems. Scime et
al. implemented a CNN model for defect detection in in-situ images from
LPBF produced Ti-6Al-4V, and reported defect detection and defect differentiation
accuracies of 97% and 85% respectively [63].
Overall, NN models are complex but perform strongly in most classification tasks [63].
The choice of model used in classification should, however, be based upon the input features
and computational intensity. Most classification models can be used in parametric
classification problems; however, for image analysis or acoustic emission problems CNN
and SVM models are suggested. In this thesis, several classification algorithms will be
applied to tabular porosity data obtained from cross sectional microscopy images
examined using ImageJ software.
The application of supervised ML algorithms in AM has been predominantly oriented
around process parameter optimisation, or property prediction in AM components.
This thesis aims to use supervised classification to differentiate between image
analysis porosity data for a series of AM components produced at varying gas flows.
To the author's knowledge, the application of supervised classification to image analysis
data of porosity in AM has not previously been investigated. In general, in ML studies, it is
best to implement a series of classification models and to assess their performance,
rather than optimising a single model.
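The practice of comparing several models on the same data can be sketched with a small cross-validation harness. The two classifiers below (a majority-class baseline and a nearest-centroid classifier) are deliberately simple stand-ins, and the feature values and the ‘low’/‘high’ labels are hypothetical; they are not the eight algorithms or the porosity dataset used in this thesis.

```python
import math
from collections import Counter


def majority_classifier(train_X, train_y):
    """Baseline: always predict the most common training label."""
    label = Counter(train_y).most_common(1)[0][0]
    return lambda x: label


def nearest_centroid_classifier(train_X, train_y):
    """Predict the label whose class mean lies closest to the input."""
    centroids = {}
    for label in set(train_y):
        rows = [x for x, lab in zip(train_X, train_y) if lab == label]
        centroids[label] = tuple(sum(c) / len(rows) for c in zip(*rows))
    return lambda x: min(centroids, key=lambda lab: math.dist(x, centroids[lab]))


def cross_validated_accuracy(make_classifier, X, y, folds=4):
    """Hold out every `folds`-th item in turn, train on the rest, and
    return the fraction of held-out items classified correctly."""
    correct = 0
    for f in range(folds):
        test_idx = set(range(f, len(X), folds))
        train_X = [x for i, x in enumerate(X) if i not in test_idx]
        train_y = [lab for i, lab in enumerate(y) if i not in test_idx]
        predict = make_classifier(train_X, train_y)
        correct += sum(predict(X[i]) == y[i] for i in test_idx)
    return correct / len(X)


# Hypothetical two-feature data for two classes.
X = [(1.0, 0.9), (1.2, 1.1), (0.8, 1.0), (1.1, 0.8),
     (3.0, 3.1), (3.2, 2.9), (2.9, 3.0), (3.1, 3.2)]
y = ["low", "low", "low", "low", "high", "high", "high", "high"]

models = {
    "majority baseline": majority_classifier,
    "nearest centroid": nearest_centroid_classifier,
}
for name, factory in models.items():
    print(f"{name}: accuracy = {cross_validated_accuracy(factory, X, y):.2f}")
```

Evaluating every candidate with the same folds keeps the comparison fair, and a trivial baseline shows how much of each score reflects genuine learning.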
2.5 LITERATURE REVIEW SUMMARY
This review, in addition to providing an overview of AM, demonstrated the enhanced
capabilities of the technology when compared to traditional manufacturing techniques.
The differing types of AM processes were introduced, along with a detailed discussion
of the areas and industries in which AM has had its most considerable impact. Due
to its relevance to the research work, the focus was on the LPBF process, with
particular attention to the powder bed laser interaction. The effect of individual process
parameters, and their relation to the formation of different types of defects, particularly
porosity, was detailed. The different levels and types of porosity were
discussed, along with their effect on the mechanical performance of the resultant alloy
parts. Although a number of authors have suggested the importance of inert gas flow
inside the chamber to the quality of the LPBF part, the direct influence of the gas flow
on the porosity of the part has to date not been investigated in detail. In this thesis
the impact of systematically altering the flow rate of the purge gas during the LPBF
printing of Ti-6Al-4V cylinders is explored, to evaluate how the gas flow rate influences
the type and level of porosity in printed alloy parts. This study will also compare
the use of XCT and optical microscopy cross sectional analysis as approaches for
investigating alloy porosity.
This literature study included an in-depth review of Machine Learning in general, along
with its specific applications to AM. This included the use of both Unsupervised
Machine Learning, specifically Clustering, and Supervised
Machine Learning, specifically Classification. The different forms of algorithms applied
in ML were discussed, as well as examples of where they are applied. The emergence
of ML as a vital statistical tool in the optimisation of AM was described.
Both unsupervised and supervised machine learning are investigated in this
thesis for the analysis of porosity data. In the case of the latter, eight different
supervised classification algorithms are applied to micrograph data, with the aim of
evaluating algorithm performance and the ability to differentiate between the three types
of porosity. This represents the first application of unsupervised clustering to XCT and
micrograph porosity data, as well as the first use of supervised classification to differentiate
types of porosity within an alloy matrix.
CHAPTER 3: EXPERIMENTAL METHODS AND MATERIALS
3.1 INTRODUCTION
This chapter provides an overview of the printing of the Ti-6Al-4V alloy used within this
porosity study. Details of the Renishaw RenAM 500M LPBF system used to print the
parts, along with its in-process photodiode monitoring system, known as InfiniAM, are
also provided. The characterisation of porosity in the alloys, using
both X-ray computed tomography and optical microscopy analysis of the
sectioned Ti-6Al-4V alloy samples, is also discussed. ImageJ was used for the analysis of the
resulting optical microscopy images.
3.2 MATERIALS: TI-6AL-4V
Gas atomised Ti-6Al-4V was sourced from AP&C (GE Additive, Montréal, Canada)
and was used for the manufacture of alloy test samples by LPBF. This material has a
mean powder particle size of 34 μm, with a quoted size distribution in the range 20–46 μm.
Powder size is controlled through a sieving process, internal to the Renishaw
system. This removes powder above 32 μm. The powdered material has a
composition according to ASTM F3001, which is shown in Table 2, as per the AP&C test
certificate [22].
| Al | V | Fe | O | C | N | H | Y | Other |
|---|---|---|---|---|---|---|---|---|
| 6.34 | 3.81 | 0.019 | 0.13 | 0.03 | <0.01 | 0.002 | <0.01 | <0.04 |
Table 2: Chemical Composition (wt%) of Ti-6Al-4V
3.3 RENISHAW RENAM 500M AND INFINIAM IN- PROCESS MONITORING SYSTEM
The LPBF RenAM 500M is a production scale AM printing system (Renishaw PLC,
United Kingdom), which utilises a Yb:YAG fibre laser system operating in a modulated
mode [39]. The laser has a maximum power output of 500 W, at a wavelength of 1070
nm. The system has a maximum operational build volume of 250 mm x 250 mm x 335
mm. For the manufacture of Ti-6Al-4V, argon gas is used both to provide an inert
atmosphere and to remove process by-products. A vacuum is first applied to the
build chamber to reduce oxygen within the chamber to a level of 0.6%. Once this level is
achieved, argon is backfilled into the chamber to provide an inert atmosphere during
part printing.
Figure 19: Renishaw RenAM 500M LPBF System [23]
In order to investigate the effects of Ar gas flow on the porosity of the printed alloy
parts, the laser power, laser speed parameters etc. were kept constant, while the argon gas
flow purging rate was varied. This was set at three different values, namely Low
(26per hour), Medium (31per hour) and High (36per hour). Each build
contained 77 cylinders produced at the same locations within the build plate. For each
build, cylindrical Ti-6Al-4V specimens with a height of 25 mm and a diameter of 15
mm were printed, under the processing conditions given in Table 3. Seventy-seven
cylinders were printed at each of the three gas flow rates investigated. Seventeen
samples were selected from each print build for XCT porosity analysis, in order to
obtain a representative distribution across the three build plates. The test parts were
selected from the same print plate position for each build, as indicated in Figure 20.
The specimens were removed from the build-plate using wire EDM.
Scan Type | Laser Power (W) | Point Distance (µm) | Exposure Time (µs) | Hatch Distance (µm) | Layer Height (µm)
Hatch | 200 | 60 | 70 | 95 | 60
Border | 160 | 20 | 30 | 40 | 60
Contour | 200 | 75 | 50 | 50 | 60
Table 3: Processing Parameters used for the printing of the Ti-6Al-4V cylindrical parts.
Note the printing studies were carried out three times, within which the Ar shielding
gas flow was varied at three levels, as detailed in the text.
Figure 20: Photograph of a build plate, showing the cylindrical printed alloy parts.
The 17 parts chosen for porosity analysis are marked by orange squares.
A second sample set, referred to in Chapter 6 as the ‘Test Set’, was obtained
from a separate Ti-6Al-4V alloy printing study, which was carried out using the
Renishaw RenAM 500M system. This sample set was produced using the same
processing parameters as described in Table 3. The samples produced were
cylindrical, with a diameter of 10 mm and a height of 50 mm.
3.3.1 IN-PROCESS MONITORING
The Renishaw RenAM 500M LPBF system comes equipped with an inbuilt photodiode
based process monitoring system known as the InfiniAM Spectral system. A schematic
of this system is shown in Figure 21 [26].
Figure 21: Schematic of the InfiniAM Spectral System, the meltpool data labelled 18
is monitored using the photodiodes labelled 4 and 5 [26]
In Figure 21, a 500 W ytterbium laser emits light with a wavelength of approximately
1070 nm, as shown at label 1. A fraction of the laser light passes through a fixed mirror
and is detected in the LaserVIEW module, at label 10, which records information about the
laser input energy at a rate of 100 kHz. Radiation emitted from the meltpool (labels 13,
15 and 18) is transmitted back up the laser path, where it is ultimately recorded by two
photodiodes (labels 4 and 5) at a rate of 100 kHz. These photodiodes make up the
MeltVIEW system, where photodiode 4 records visible light emissions and
photodiode 5 measures thermal, infrared (IR) emissions. In total, there are three
photodiode sensors, at labels 4, 5 and 10.
3.4 X-RAY COMPUTED TOMOGRAPHY
The porosity of the printed cylindrical parts was evaluated by X-ray Computed
Tomography (XCT), using a GE Nanotom M system, which has a resolution of 12.5
μm [27]. The XCT scans were analysed using the porosity/inclusion analysis (PIA)
module in VGStudio Max version 3.5 [37]. Datos Rec software reconstructs the images
taken by XCT, while VGStudio Max imports these XCT volumes and displays the
reconstructed component as a 3D model. The surface determination tool allows
the program to identify the material/air boundary by measuring the grey values of
individual ‘voxels’, or 3-dimensional pixels. The PIA module examines the grey value
of each voxel within the 3D model to check if it is part of a pore/void, and a group
of related voxels is allocated to the same pore. A region of interest was
created for the analyses, which spanned from 23 to 24 mm. The selected region avoided
variations between parts caused by the engraved labels which had been placed on
the top surface of the parts, as well as any damage associated with the use of wire
EDM to remove the parts from the build plate. The PIA module VG Easy Pore was
applied to the parts using a minimum filter volume of 60 voxels,
to minimise the effect of noise from the XCT scan data being labelled as porosity.
The VGStudio Max PIA module outputs a list of features relating to each individually
identified pore. The features diameter, voxel volume, surface area, sphericity,
compactness, projected height in the z-direction, and aspect ratio were chosen as
the input dataset to the clustering algorithm. As per the VGStudio Max reference
manual, the feature ‘diameter’ represents the diameter of the circumscribed sphere
of the defect [38]. The feature ‘sphericity’ indicates the ratio between the surface area
of a sphere with the same volume as the defect and the surface area of the defect, as
expressed by Equation 2, whereas the ‘compactness’ is the ratio
between the volume of the defect and the volume of the circumscribed sphere, as
shown in Equation 3 [38]. The aspect ratio is defined as the projected pore size in
the x-axis, divided by the projected pore size in the y-axis. The projected size in the
z direction is the pore height perpendicular to the build plate. A voxel is
defined as ‘a value on a regular grid in 3-dimensional space’, analogous to a pixel in
a 2D image; the voxel volume of a pore is its volume expressed in voxels.
$\text{Sphericity} = \dfrac{A_{\text{sphere}}}{A_{\text{defect}}}$ Equ. 2
$\text{Compactness} = \dfrac{V_{\text{defect}}}{V_{\text{sphere}}}$ Equ. 3
Here $A_{\text{sphere}}$ is the surface area of the sphere with the same volume as the
defect, $A_{\text{defect}}$ and $V_{\text{defect}}$ are the surface area and volume of the
defect, and $V_{\text{sphere}}$ is the volume of the circumscribed sphere.
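The definitions in Equations 2 and 3 can be illustrated with a minimal Python sketch. The function names and the two example pores are hypothetical and are not VGStudio Max outputs; the sketch simply evaluates the two ratios as they are described in the text.

```python
import math

def sphericity(volume, surface_area):
    """Surface area of the equal-volume sphere divided by the defect surface area (Equ. 2)."""
    # Radius of a sphere with the same volume as the defect
    r_eq = (3.0 * volume / (4.0 * math.pi)) ** (1.0 / 3.0)
    return 4.0 * math.pi * r_eq ** 2 / surface_area

def compactness(volume, circumscribed_diameter):
    """Defect volume divided by the volume of the circumscribed sphere (Equ. 3)."""
    v_sphere = math.pi * circumscribed_diameter ** 3 / 6.0
    return volume / v_sphere

# Hypothetical pore 1: a perfect sphere with a diameter of 100 µm
d = 100.0
v = math.pi * d ** 3 / 6.0
a = math.pi * d ** 2
sphere_sphericity = sphericity(v, a)
sphere_compactness = compactness(v, d)

# Hypothetical pore 2: same volume, but twice the surface area and
# twice the circumscribed diameter (an irregular, elongated defect)
elongated_sphericity = sphericity(v, 2.0 * a)
elongated_compactness = compactness(v, 2.0 * d)

print(sphere_sphericity, sphere_compactness)
print(elongated_sphericity, elongated_compactness)
```

Both ratios equal 1 for a perfect sphere and fall towards 0 as a pore becomes more irregular, which is why they are useful inputs for separating rounded pores from elongated or irregular ones.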
Unsupervised clustering was applied to the porosity features obtained
from the XCT models of the 17 cylindrical parts selected from the 77
cylinders printed at each of the three gas flow rates investigated. The unsupervised
clustering model was implemented using Python Jupyter Notebooks and the
open-source ML library scikit-learn [39]. The 7 input features used were: diameter;
voxel volume; surface area; compactness; sphericity; projected size in the z direction; and
aspect ratio.
3.5 OPTICAL MICROSCOPY EXAMINATION
The porosity of the printed components was also evaluated based on optical
microscopy examination of the sectioned components. This involved sectioning
8 randomly chosen cylindrical parts, selected from each
of the three gas flow levels investigated, using a Buehler IsoMet™ High Speed Pro
precision saw, and then obtaining cross-sectional microscopy images at specific layers
of the sectioned parts. The samples were ground using 2400 grit silicon carbide paper and
then polished using a 9 µm diamond paste. The exposed cross sections were then
etched by treating with a hydrogen peroxide and silica solution. Optical microscopy
images were then taken using the Olympus DSX1000 Digital Microscope at
magnifications of 624X and 275X.
Numerical analysis of the microscopy images was carried out using ImageJ software.
This Java-based image processing program applies greyscale
thresholds to the microscope images, thus allowing the
determination of regions of porosity [40]. An example of ImageJ porosity analysis
applied to a micrograph sample is shown in Figure 22. The features that ImageJ
extracted for each identified pore were the ‘area’, ‘circularity’, ‘aspect ratio’,
‘roundness’ and ‘solidity’ of the 2D pore. Solidity refers to the ratio of a pore’s area to
its convex area [40]. The convex area is the area of the smallest convex region that
contains all points within the pore. The aspect ratio in this case refers to the ratio of
the largest axis in a pore to the smallest axis in a pore. The circularity of a pore is
defined in Equation 4. Roundness is defined in Equation 5, where ‘major axis’ represents the
largest axis within the pore [40].
$\text{Circularity} = \dfrac{4\pi \times \text{Area}}{(\text{Perimeter})^{2}}$ Equ. 4
$\text{Roundness} = \dfrac{4 \times \text{Area}}{\pi \times (\text{Major axis})^{2}}$ Equ. 5
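As a worked illustration of the 2D shape descriptors described above, the following Python sketch evaluates circularity, roundness, aspect ratio and solidity for two hypothetical pores. The function names and pore dimensions are assumptions made for illustration only; the sketch is not the ImageJ implementation.

```python
import math

def circularity(area, perimeter):
    """4*pi*Area / Perimeter^2 (Equ. 4); equals 1 for a perfect circle."""
    return 4.0 * math.pi * area / perimeter ** 2

def roundness(area, major_axis):
    """4*Area / (pi * Major axis^2) (Equ. 5)."""
    return 4.0 * area / (math.pi * major_axis ** 2)

def aspect_ratio(major_axis, minor_axis):
    """Largest axis of the pore divided by its smallest axis."""
    return major_axis / minor_axis

def solidity(area, convex_area):
    """Pore area divided by the area of its convex hull."""
    return area / convex_area

# Hypothetical pore 1: circle with a radius of 5 µm
circle_area = math.pi * 5.0 ** 2
circle_circularity = circularity(circle_area, 2.0 * math.pi * 5.0)
circle_roundness = roundness(circle_area, 10.0)

# Hypothetical pore 2: ellipse with a major axis of 20 µm and a minor axis of 5 µm
ellipse_area = math.pi * (20.0 / 2.0) * (5.0 / 2.0)
ellipse_roundness = roundness(ellipse_area, 20.0)
ellipse_aspect_ratio = aspect_ratio(20.0, 5.0)
ellipse_solidity = solidity(ellipse_area, ellipse_area)  # an ellipse is already convex

print(circle_circularity, circle_roundness)
print(ellipse_roundness, ellipse_aspect_ratio, ellipse_solidity)
```

For an elliptical pore the roundness is the inverse of the aspect ratio, so these two features carry closely related information for smooth pores and differ mainly for irregular ones.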
Figure 22: An example of ImageJ porosity thresholding applied to a micrograph
image (left) of a Ti-6Al-4V sample produced on the Renishaw RenAM 500M. Based upon
the greyscale value of each pixel, each pixel in the micrograph is labelled as ‘Porous’ or
‘Non-Porous’ (right), and geometric features relating to each individual pore are
calculated using the porosity analysis module available in ImageJ. The thresholded
image contains porous points (in black) and solid material (in white). Note that while
the Bakelite mount surrounding the solid material is also black, this area was not
included in the porosity analysis.
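The thresholding step can be sketched in Python with NumPy and SciPy. This is an illustrative stand-in only: the synthetic image, the grey-value threshold of 100 and the mount mask are all assumptions, and the ImageJ routine itself is not reproduced here.

```python
import numpy as np
from scipy import ndimage

# Hypothetical 8-bit greyscale micrograph: bright solid (200) with two dark pores (20)
image = np.full((100, 100), 200, dtype=np.uint8)
image[10:20, 10:20] = 20     # square pore of 100 pixels
image[50:55, 60:80] = 20     # elongated pore of 100 pixels

# Hypothetical dark mount region, which must be excluded from the analysis
image[:, 90:] = 10
sample_mask = np.ones(image.shape, dtype=bool)
sample_mask[:, 90:] = False

threshold = 100              # assumed grey-value threshold
porous = (image < threshold) & sample_mask

# Label connected porous regions; each label is treated as one pore
labels, n_pores = ndimage.label(porous)
pore_areas = np.bincount(labels.ravel())[1:]   # pixel count per pore (label 0 is background)

# Area fraction of porosity within the sample region only
porosity_percent = 100.0 * porous.sum() / sample_mask.sum()
print(n_pores, pore_areas, round(porosity_percent, 2))
```

Masking the mount before thresholding matters because the mount is as dark as the pores, and would otherwise be counted as one very large pore.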
CHAPTER 4: POROSITY ANALYSIS RESULTS
4.1 INTRODUCTION
This chapter provides the results of the porosity study carried out using XCT analysis
of the Ti-6Al-4V alloy parts, printed at the three Ar gas flow rates investigated.
It also details the results obtained from a cross-sectional analysis study of the alloy
parts, which was combined with ImageJ porosity analysis. Pore type results obtained
from both methods are discussed, as well as an analysis of the overall porosity
obtained using both methods. A direct comparison of both methods cannot be
made, due to the differing resolutions, as well as the fact that the CT scan provides
information on the entire cylinder, while the cross-section analysis provides details on
individual cross-sectional layers. Nevertheless, the trends in the outputs from both
analysis methods can be examined.
4.2 COMPARISON OF POROSITY OBTAINED THROUGH XCT AND MICROSCOPY
This study compared the porosity results obtained using XCT and optical microscopy,
with the latter providing image data at two different magnifications (624X and 275X).
The microscopy data was in turn analysed using ImageJ, in order to evaluate the level of
porosity. The higher magnification dataset was used in order to obtain a better
resolution of the porosity observed. The lower magnification of 275X was used in a
comparison with the porosity data obtained from XCT. The XCT data was examined
using the Porosity Inclusion Analysis (PIA) module ‘Easy Pore’ available in the VGStudio
Max XCT analysis software, as discussed in section 3.4. A VGStudio Max reconstruction of
a cylinder produced at the same build plate location in each of the three gas flow
builds is presented in Figure 23, with the blue markings representing the porosity. Through
visual inspection of Figure 23, the marked increase in porosity with reducing Ar gas
flow rate is evident.
Figure 23: VGStudio Max XCT reconstruction of cylinder samples produced at the
same build plate location with high gas flow (36 m³ per hour) (left); medium gas flow
(31 m³ per hour) (centre); and low gas flow (26 m³ per hour) (right)
Figure 24: Distribution of XCT porosity across the build plates, based on
measurements obtained from 17 samples examined for each of the three gas flow rate
builds
Figure 24 shows a contour plot which was obtained based on an analysis of XCT
scans of each of the gas flow rate builds. The contour plot shows the average detected
porosity of each cylinder plotted at its location on the build plate. Again, it is evident
from this figure that the low gas flow build contained a substantially larger amount of
porosity compared to the medium and high gas flow builds, with the medium gas flow
build containing marginally more porosity than the high gas flow build. From Figure
24 it is also clear that higher levels of porosity are obtained to the right of the contour
plot image obtained for the low gas flow rate print. This indicates that the higher
porosity is in fact close to the gas inlet rather than away from it. This result is surprising,
as process by-products such as condensate would be expected to deposit at higher
levels in the print bed region away from the Ar gas inlet. Note the relatively
homogeneous distribution of porosity for the print
beds obtained at the medium and high gas flow rates.
Figure 25 shows an example of a cross section image from the low gas flow build, taken at
275X magnification. This sample was from rod 77, which was printed close to the gas
inlet, and was found to contain considerable levels of porosity. This was also evident
based on the XCT analysis data (Figure 23).
Figure 25: An example cross section image taken from the low gas flow build,
demonstrating a considerable level of lack of fusion porosity
Figure 26: Comparison between images of Ti-6Al-4V obtained using XCT (a), and from
approximately the same area obtained after cross sectioning, followed by
microscopy examination (magnification 275X) (b), for a sample printed at the low gas
flow rate. Note the large ‘key-shape’ LOF pore, with an approximate length of 800 μm
in the y-direction, at point 1 on the XCT cross section (a), with the corresponding pore
in the microscope image (b).
In order to make a comparison between the porosity analysis methodologies, an
investigation into the minimum detectable pore diameter associated with each
method was undertaken. The minimum diameter of pores detected using the XCT PIA
was approx. 80 µm, which is associated with selecting a minimum pore detection
volume of 60 voxels during the VGStudio Max PIA analysis step. This minimum
volume was selected to ensure that noise in the XCT scans was not falsely labelled
as porosity. This is thus a limitation in the pore resolution which can be obtained using
the PIA analysis approach. In order to help establish the accuracy of the XCT results,
optical microscopy examination of porosity in alloy cross sections was also carried out.
Figure 26 provides an example of a comparison between an XCT image and a cross
section optical micrograph of approximately the same layer region in the titanium alloy.
It is clear from this figure that, at approximately the same magnification level,
microscopy facilitates the identification of smaller pores, i.e., at points 3 and 4. A feature
of note in this image, which was obtained for a part printed at the low Ar flow rate, is that
the porosity is mainly associated with LOF porosity.
The minimum recorded pore diameter from ImageJ analysis of the Micrograph cross
section samples at an increased magnification of 624X was found to be 0.5 µm. In
comparison the minimum detectable diameter at a magnification of 275X was found
to be 1.3 µm.
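The link between the 60 voxel filter and the minimum pore size detected by XCT can be checked with a short calculation. The sketch below assumes isotropic voxels whose edge length equals the quoted 12.5 µm resolution, which is an assumption not stated explicitly in the text, and computes the diameter of a sphere with the same volume as the smallest accepted pore.

```python
import math

voxel_edge_um = 12.5   # assumed isotropic voxel edge length (quoted XCT resolution)
min_voxels = 60        # minimum pore volume filter used in the PIA step

# Volume of the smallest pore that passes the filter
min_volume_um3 = min_voxels * voxel_edge_um ** 3

# Diameter of a sphere with the same volume as that pore
equivalent_diameter_um = (6.0 * min_volume_um3 / math.pi) ** (1.0 / 3.0)

print(f"{min_volume_um3:.1f} um^3, {equivalent_diameter_um:.1f} um")
```

Under these assumptions the equivalent-sphere diameter is approximately 61 µm. The PIA ‘diameter’ feature is the diameter of the circumscribed sphere, which can never be smaller than the equivalent-sphere diameter, so a minimum detected diameter of approx. 80 µm is consistent with this lower bound. In either case the XCT limit is well above the 0.5 µm and 1.3 µm limits of the micrograph analysis.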
Table 4 compares the average level of porosity recorded for the print builds obtained
at the three gas flow rates, when analysed based on ImageJ examination of the
micrographs, along with the XCT measurements. The micrograph images were taken
at two magnifications, as shown. A total of 55 images were taken at 275X
magnification, while a total of 9 images were taken at 624X magnification. At
each of the magnifications, as well as in the XCT analysis, a reduction in overall
porosity is seen with increasing Ar gas flow rate. This is most likely because a reduction
in purging gas flow rate results in less removal of process by-products, which
increases the likelihood of defects during the LPBF process [61].
Ar Gas Flow Rate | XCT Porosity (Volume Percent) | ImageJ Micrograph Porosity (M=275X) | ImageJ Micrograph Porosity (M=624X)
Low | 0.21% | 0.65% | 3.64%
Medium | 0.03% | 0.10% | 0.81%
High | 0.01% | 0.02% | 0.4%
Table 4: Comparison between the porosity measurements obtained based on XCT and
Micrograph ImageJ Porosity Measurements.
Although a direct comparison between XCT and micrograph results is not possible, it
is interesting to note that the same trend is present in both methods. The highest level
of overall porosity in each method is present within the low gas flow rate build, while
a decrease is observed in the medium gas flow rate build, and subsequently in the high
gas flow rate build.
4.3 CONCLUSIONS
This study investigated the effect of argon gas flow rate during the LPBF printing of
Ti-6Al-4V on the resulting alloy porosity. XCT analysis demonstrated that the build
manufactured under a low gas flow rate (26 m³ per hour) had significantly increased
levels of porosity relative to the builds manufactured with the medium (31 m³ per hour)
and high (36 m³ per hour) gas flow rates. This is most likely due to insufficient
removal of process by-products from the build plate because of the reduced gas flow
rate. The build manufactured with a high gas flow rate had broadly similar levels of
porosity to that manufactured with a medium gas flow rate. For a commercial
manufacturing process where Ar gas costs are a significant consideration, this would
indicate that the medium gas flow rate could be selected for routine production, in
order to reduce the level of Ar used during part printing. This is the case for the
RenAM 500M, where Renishaw’s recommended flow rate is the ‘medium’ rate of
31 m³ per hour.
The level of porosity obtained using XCT was compared with that obtained based
on cross sectional microscopy images of the printed alloy. Taking the example of
the low gas flow rate, the porosity determined by XCT was 0.21%, while that obtained
using the micrograph examination at approximately the same magnification was
0.65%. The limited resolution of the XCT analysis technique (12.5 µm, giving a minimum
detected pore diameter of approx. 80 µm) is a significant factor in the porosity
differences obtained using the two techniques.
It is important to stress, however, that due to the differing resolutions of these
two methods, they cannot be compared directly. Although micrograph data enables
the analysis of porosity at a finer resolution than XCT, the sectioning of samples
required to obtain the micrographs is a time consuming process, and only reveals a
fraction of the total porosity present within a sample. XCT analysis is faster, and
produces porosity information on entire samples, however at a lower resolution.
CHAPTER 5: UNSUPERVISED CLUSTERING AND CLUSTER NUMBER EVALUATION
5.1: INTRODUCTION
In Chapter 4 the overall level of porosity was evaluated based on both XCT and optical
microscopy examination of the Ti-6Al-4V alloy parts. As detailed in the literature review,
for these LPBF alloy parts there are generally three different forms of microstructural
porosity considered: Gas pores; Keyhole pores; and Lack of Fusion pores. In addition
to their variation in morphology, the type of porosity also impacts on the mechanical
properties of the resultant component [117][118]. Lack of Fusion pores have the
largest impact on mechanical properties, due to their ability to act as stress
concentrators, and subsequently crack initiators [119][120]. Keyhole pores also have
a considerable impact upon mechanical performance, although not as drastic as Lack
of Fusion pores [117]. Therefore, it is important to be able to distinguish between the
three forms of microstructural porosity.
This chapter provides the results for the application of the ‘Gaussian Mixture Model’
(GMM) unsupervised machine learning technique as an approach to evaluate the type
of porosity present in the parts. This examination is based on the XCT and microscopy
datasets for porosity analysis of the Ti-6Al-4V cylinders. In order to carry out this
investigation it is necessary to determine the number of clusters present within the
data. The chapter therefore begins with an overview of the clustering model. This is
followed by the implementation of four different statistical approaches to determine the
number of clusters present within the XCT and micrograph datasets. The four approaches
investigated were a dendrogram approach, an elbow plot approach, a Bayesian
Information Criterion (BIC) score approach, and a Silhouette Score approach.
Finally, the conclusions from the implementation of the GMM approach to the three
datasets are reviewed. The unsupervised clustering results from the GMM model
at each gas flow rate are presented for each of the XCT and micrograph datasets.
5.2: UNSUPERVISED CLUSTERING
Unsupervised clustering was applied to three datasets obtained from the XCT and
optical microscopy analysis (evaluated at two magnifications). The XCT dataset
contained 95,489 datapoints obtained from the total of 51 cylinders examined at the
three gas flow rates investigated. Each datapoint contained 7 independent features,
as well as a categorical variable ‘Gas Flow’. The two datasets obtained from ImageJ
analysis of the optical microscopy images both contained 5 features, and the same Gas
Flow variable. The low magnification dataset was taken at a magnification of 275X and
contained 134,987 datapoints, while that obtained at the higher magnification of 624X
contained 249,826 datapoints.
The GMM model is a probabilistic model that assumes all data points were generated
from a finite number of Gaussian distributions with unknown parameters [121]. For this
study the GMM model was implemented using Python Jupyter notebooks. GMM
models can be thought of as a generalised K-Means clustering model that incorporates
the covariance structure of the data. GMM clustering is also a soft clustering technique,
which assigns each datapoint a probability of belonging to each cluster, in
comparison to K-Means, which is a hard clustering technique that assigns each
datapoint to exactly one cluster. Clustering, with regards to data
science, can be defined as grouping datapoints together into groups (or clusters) such
that points in each cluster are more like other points within that cluster than points
within other clusters [122]. This is the basis of unsupervised clustering, which is used
in order to examine clusters present within datasets.
The covariance of the model refers to the covariance matrix of an n-dimensional
Gaussian distribution [121]. The covariance constraint for GMM models can be one of
four types: Spherical; Diagonal; Tied; Full. In higher dimensions, the univariate and
bivariate Gaussian distributions generalise to a distribution described by a mean vector
µ and an n-by-n covariance matrix Σ. In 2 dimensions, the Gaussian parameters may
look like this [121]:
$\mathcal{N}\left[\begin{pmatrix}\mu_1\\ \mu_2\end{pmatrix}, \begin{pmatrix}\sigma_1^{2} & \sigma_{12}\\ \sigma_{21} & \sigma_2^{2}\end{pmatrix}\right]$ Equ. 6 [121]
The right-side matrix in this equation defines the covariance matrix. The values within
this matrix determine which covariance type fits the data [123]. In spherical covariance,
the off-diagonal elements are all 0, with all diagonal elements being of equal value. All
off-diagonal elements being 0 indicates that the parameters have no correlation, while
equal diagonal values result in a circular probability distribution, indicating that the
spread of the Gaussian distribution along each axis is the same. Diagonal
covariance also implies off-diagonal values of 0, while the diagonal values can take any
positive values, so each cluster can have its own spread along each coordinate axis. In
tied covariance, all clusters share a single covariance matrix, which may contain nonzero
off-diagonal values, so all Gaussian distributions must have the same shape and
orientation. Full covariance allows for correlation between variables, meaning nonzero
values for off-diagonal covariance matrix entries, with each cluster having its own
covariance matrix.
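The four covariance constraints can be compared by inspecting how many covariance parameters a fitted model stores for each. The sketch below uses scikit-learn's `GaussianMixture` on hypothetical two-feature data (two synthetic blobs, not thesis data) and records the shape of the fitted `covariances_` array for each constraint.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Hypothetical 2-feature data drawn from two well-separated Gaussian blobs
X = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=1.0, size=(200, 2)),
    rng.normal(loc=[6.0, 6.0], scale=1.0, size=(200, 2)),
])

shapes = {}
for cov_type in ("spherical", "diag", "tied", "full"):
    gmm = GaussianMixture(n_components=2, covariance_type=cov_type,
                          random_state=0).fit(X)
    # Shape of the stored covariance parameters for this constraint
    shapes[cov_type] = gmm.covariances_.shape

for cov_type, shape in shapes.items():
    print(cov_type, shape)
```

Spherical covariance stores one variance per cluster, diagonal covariance stores one variance per feature per cluster, tied covariance stores one matrix shared by all clusters, and full covariance stores one matrix per cluster. Full covariance is therefore the most flexible constraint, but it also has the most parameters to estimate.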
Figure 27 shows the results obtained when all covariance constraints were plotted for
an example open source scikit-learn dataset known as ‘Iris’ [121]. The data
is four dimensional, however only two dimensions are plotted, thus some points are
separated in other dimensions. Each datapoint belongs to one of three classes,
namely Setosa, Versicolor, and Virginica. As can be seen in the spherical covariance
sub-plot in Figure 27, the resultant clusters are spherical in shape, although circular in the
image due to being reduced to 2 dimensions. Diagonal covariance results in clusters
that may be independently shaped, either elliptically or spherically, however their axes
must lie along the coordinate axes, as seen in Figure 27. Tied covariance differs from
diagonal covariance in that clusters may be oriented in any direction regardless of the
coordinate axes, however all clusters must be oriented identically. Full covariance then
allows for clusters to independently adopt any shape.
Figure 27: Covariance constraint parameters for GMM plotted on open source ‘Iris’
data. Training datapoints are marked with crosses, while test datapoints are marked with
dots. The GMM model was trained using the training dataset, and then evaluated on the
test set [121].
The GMM model required the input of two parameters: the number of clusters, along
with the covariance constraint [121]. For this research, full covariance was
implemented in order to allow for any possible cluster orientation.
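A minimal sketch of a full covariance GMM in scikit-learn is given below. The data is a hypothetical stand-in for a pore feature table with 7 features, and the standardisation step is an assumption added for illustration; neither is taken from the thesis notebooks.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
# Hypothetical pore feature table: rows are pores, columns are 7 features
group_a = rng.normal(loc=0.0, scale=0.5, size=(300, 7))
group_b = rng.normal(loc=5.0, scale=1.0, size=(100, 7))
X = np.vstack([group_a, group_b])

# Assumed pre-processing: scale each feature to zero mean and unit variance
X_scaled = StandardScaler().fit_transform(X)

# The two required inputs: the number of clusters and the covariance constraint
gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0)
labels = gmm.fit_predict(X_scaled)            # most likely cluster for each pore
probabilities = gmm.predict_proba(X_scaled)   # soft membership of each pore in each cluster

print(np.bincount(labels))
print(probabilities[:3].round(3))
```

The `predict_proba` output is what makes the GMM a soft clustering method: each pore receives a probability for every cluster, and the hard label is simply the cluster with the highest probability.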
5.3: CLUSTER NUMBER EVALUATION
In most clustering problems, the model requires the user to input the number of
clusters present within the dataset. The clustering model will then attempt to split the
data into the predetermined number of clusters [20]. There are a variety of approaches
that can be taken in order to determine the number of clusters present, which can then
be used as an input to the clustering algorithm. In this thesis four approaches were
investigated, as follows: hierarchical clustering evaluation, elbow plot evaluation,
Bayesian Information Criterion, and silhouette score evaluation. All four approaches
are discussed in turn, and were all implemented in Python Jupyter Notebooks, using
scikit-learn open-source packages.
5.3.1: HIERARCHICAL CLUSTERING AND DENDROGRAM APPROACH
This is a distance-based approach, which uses the Euclidean distance,
Manhattan distance, or Mahalanobis distance to calculate the closeness of the data
points [20]. A limitation of this type of clustering, however, is that it cannot handle
extensive datasets, as it requires a large amount of computing power and time [42]. One
of the advantages of hierarchical clustering is that the results are represented
graphically through a ‘dendrogram’, which shows the clustering at different
hierarchical levels [42]. The dendrogram also indicates the number of clusters
that will best depict the data groups. To find the optimum cluster choice, the largest
vertical distance in the dendrogram that is not crossed by a merge between clusters is
identified, and a horizontal line is drawn through it; this is marked
as the red line in Figure 28, using an example dendrogram. The number of vertical
lines in the dendrogram intersected by this horizontal line is the optimal value for the
number of clusters. Hierarchical clustering was carried out for both the XCT and
micrograph datasets, in order to determine the optimal number of clusters. However, the
dendrogram approach may not always yield an optimal cluster number, as evident in
Figure 28, where the largest vertical distance (above clusters A and B, marked in
green) can indicate any of 2, 3 or 4 clusters. In this case, other methods of determining
cluster number, such as those discussed in sections 5.3.3 to 5.3.7 of this thesis, can
be used.
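The largest-vertical-distance rule can be sketched numerically with SciPy, without drawing the dendrogram. The data below is hypothetical, and Ward linkage on Euclidean distances is an assumption, since the linkage method is not stated in the text.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(1)
# Hypothetical 2D data with three compact, well-separated groups
X = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=0.3, size=(20, 2)),
    rng.normal(loc=[5.0, 0.0], scale=0.3, size=(20, 2)),
    rng.normal(loc=[0.0, 5.0], scale=0.3, size=(20, 2)),
])

# Ward linkage (assumed); column 2 of Z holds the height of each merge
Z = linkage(X, method="ward")
heights = Z[:, 2]

# Largest vertical gap between successive merges in the dendrogram
gaps = np.diff(heights)
i = int(np.argmax(gaps))

# A horizontal cut inside that gap leaves (merges above the cut) + 1 clusters
n_clusters = len(heights) - (i + 1) + 1
labels = fcluster(Z, t=n_clusters, criterion="maxclust")

print(n_clusters, np.bincount(labels)[1:])
```

When two or more gaps are of similar size, as in the example of Figure 28, the position returned by `argmax` is no longer meaningful on its own, which is the subjectivity described above.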
Figure 28: Two-dimensional feature space, with 6 points and resultant dendrogram.
As can be seen from the dendrogram, no clear cluster number is evident. The largest
vertical line (green) can be seen to intersect any of 4, 3 or 2 vertical lines depending
on where the horizontal (red) line is placed [123]
5.3.2: HIERARCHICAL CLUSTERING RESULTS
Figure 29: Hierarchical Clustering Dendrogram plots for (a) Low Gas Flow Rate (26 m³
per hour) (b) Medium Gas Flow Rate (31 m³ per hour) (c) High Gas Flow Rate (36 m³
per hour)
Figure 29 shows the hierarchical clustering dendrograms for the three gas flow rates
investigated. In each of the images, the blue section of the dendrogram contains the
largest vertical distance. In (a) the ideal cluster number is 2: a horizontal line
intersecting the left side blue vertical line would intersect one other vertical line
over most of its length. In (b), for the medium gas flow rate, the ideal cluster number
is unclear. The left-hand blue line is again the largest vertical line, however a horizontal
line intersecting this line could intersect any of 2, 3, 4 or 5 vertical lines, so no clear
cluster number stands out in this graph, introducing an element of subjectivity into the
choice of the optimum cluster number. In (c) the optimum cluster number, similarly to (a),
appears to be 2. The lack of a clear cluster number across the three builds
indicated that the dendrogram approach is not a suitable method for determining cluster
number in this study.
The subjectivity in determining the cluster number in (b) led to further investigations
into cluster number determination. A further reason for considering alternative methods
of cluster number determination was that soft clustering methods such as GMM, which use
probabilistic Gaussian distributions to determine the most likely cluster for a datapoint,
do not use hierarchical clustering to determine cluster number. The dendrogram method
is more suited to hard clustering methods, such as K-Means clustering [123].
5.3.3: ELBOW PLOT APPROACH
This approach is widely used in cluster analysis, as it generally produces an easy to
interpret result [48]. In this approach, the average squared distance between a datapoint
and its cluster centre is plotted for every cluster number value. The user then looks for an
‘elbow’, or a drastic change of slope from steep to shallow, in order to determine the
optimum number of clusters (see Figure 30 (left)). This elbow point indicates the
optimum number of clusters because, although the average distance to the cluster centre
continues to decrease with increasing cluster number beyond this point, the drastic
change in slope indicates that the reduction per additional cluster is only marginal. This
is a case of diminishing returns, where additional clusters increase complexity without
reducing the average squared distance sufficiently. The main drawback of this method,
similar to the dendrogram approach, is its subjectivity, as the elbow point may not always
be definite, and this introduces an element of choice in the cluster value. Figure 30 shows
two cases, one (left) where the elbow is clear, and another (right) where no definite
elbow is present [48]. In this figure the sum of squared errors, rather than the average
squared distance, is plotted against cluster number.
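The quantities behind an elbow plot can be computed with scikit-learn's `KMeans`, whose `inertia_` attribute is the sum of squared distances of the datapoints to their nearest cluster centre. The sketch below uses hypothetical data with three groups, and the range of 1 to 8 clusters is an arbitrary choice for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(2)
# Hypothetical 2D data with three compact, well-separated groups
X = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=0.3, size=(50, 2)),
    rng.normal(loc=[5.0, 0.0], scale=0.3, size=(50, 2)),
    rng.normal(loc=[0.0, 5.0], scale=0.3, size=(50, 2)),
])

cluster_numbers = list(range(1, 9))
sse = []
for k in cluster_numbers:
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    sse.append(km.inertia_)   # sum of squared distances to the nearest cluster centre

# Average squared distance per datapoint, the quantity described in the text
avg_sq_dist = [s / len(X) for s in sse]

# Reduction in SSE gained by each additional cluster; the elbow is where this collapses
improvements = -np.diff(sse)

for k, s in zip(cluster_numbers, sse):
    print(k, round(s, 1))
```

For this synthetic data the reduction from 2 to 3 clusters is far larger than that from 3 to 4, giving a clear elbow at 3. The sum of squared errors and the average squared distance differ only by the constant factor of the number of datapoints, so both give the same elbow position.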
Figure 30: Elbow plot comparison, with a definite elbow point present (left), and no elbow
point present (right). The lack of a clear elbow in a plot indicates that a different cluster
number evaluation method is required [48]
5.3.4: ELBOW PLOT RESULTS
Figure 31: Elbow Plots for XCT analysis of LPBF produced Ti-6Al-4V cylinders
produced at: (a) Low Gas Flow Rate (26 m³ per hour) (b) Medium Gas Flow Rate
(31 m³ per hour) (c) High Gas Flow Rate (36 m³ per hour)
Figure 31 shows the elbow plots for the three gas flow rates investigated in this thesis.
It is evident that no clear elbow occurs in any of the three plots.
When data from all three gas flows were collated into one combined dataset, the elbow
plot did not indicate a clear cluster number either. Because of this ambiguity, the elbow
plot was not found to be suitable for the evaluation of cluster number in this
investigation.
5.3.5: BAYESIAN INFORMATION CRITERION APPROACH
The Bayesian Information Criterion (BIC) is a method for scoring and selecting a
model, or for determining the optimum number of clusters for a defined model [96].
The BIC score is named for the field from which the value was derived: Bayesian
probability and inference. This score is used for models that fit under the maximum
likelihood framework. BIC is calculated for a model using Equation 7 [96]:
$\mathrm{BIC} = -2\ln(\hat{L}) + d\,\ln(N)$ Equ. 7
where $\hat{L}$ is the maximised value of the likelihood function of the model, d is the
number of parameters in the model and N is the number of examples.
The BIC score is minimised, with a lower score indicating a better model. BIC attempts
to reduce the effect of overfitting through the introduction of the penalty term $d\,\ln(N)$,
which penalises models with a large number of parameters [98]. Consequently, the BIC
score approach prefers simpler models.
As a result of the derivation of the BIC score from the Bayesian probability framework,
if a series of models under evaluation contains a ‘true model’, the probability that BIC
will indicate the true model increases with an increasing number of samples N [96]. A
disadvantage of the BIC score is that for small datasets it tends to favour oversimplistic
models [96].
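A BIC comparison across cluster numbers and covariance constraints can be sketched with the `bic` method of scikit-learn's `GaussianMixture`. The data below is hypothetical (three synthetic groups), and the range of 1 to 6 clusters is chosen only for illustration.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(3)
# Hypothetical 2D data with three well-separated groups
X = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=0.5, size=(100, 2)),
    rng.normal(loc=[6.0, 0.0], scale=0.5, size=(100, 2)),
    rng.normal(loc=[0.0, 6.0], scale=0.5, size=(100, 2)),
])

cluster_numbers = list(range(1, 7))
bic = {}
for cov_type in ("spherical", "diag", "tied", "full"):
    bic[cov_type] = [
        GaussianMixture(n_components=k, covariance_type=cov_type,
                        random_state=0).fit(X).bic(X)
        for k in cluster_numbers
    ]

# Cluster number with the lowest BIC for each covariance constraint
best_k = {cov_type: cluster_numbers[int(np.argmin(scores))]
          for cov_type, scores in bic.items()}

for cov_type, scores in bic.items():
    print(cov_type, best_k[cov_type], [round(s) for s in scores])
```

A clear minimum in a BIC curve indicates a preferred cluster number. Where a curve keeps decreasing across the whole range examined, no preferred cluster number can be read from it.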
5.3.6: BIC SCORE RESULTS
Figure 32: Low Gas Flow XCT data BIC values vs Cluster number for GMM clustering
Figure 32 shows a BIC plot for the low gas flow data obtained from the XCT analysis
of the Ti-6Al-4V cylinders. This plot shows the BIC score obtained from a GMM applied
at each cluster value from 1 to 14 clusters. This resulted in 4 individual line graphs, for
each of the GMM covariance types, as discussed in section 5.2. No minimum BIC is
reached in any of the covariance types, with each type converging to some minimal
value far outside the above cluster interval. This suggests that the GMM is having
difficulty deciding between cluster memberships. This implies that BIC scores in this
case cannot provide a clear indication of the correct number of clusters, as no definitive
minimum exists.
It is interesting to note that the Full covariance yielded the smallest BIC value in this
specific case, followed by diagonal and tied, with spherical covariance yielding the
largest BIC value. This indicates that there is some probable relationship between
variables in the XCT analysis data. The fact that spherical covariance is approximately
an order of magnitude larger than the other three covariances, implies that spherical
GMM clusters are most probably not the most accurate clustering technique.
As there were three gas flow rates in the LPBF process investigated in this study, as
well as porosity data from three different sources, in order to combine the BIC score
analysis of each of these methods a three-by-three matrix plot is presented in the
appendix section. In each case, the optimal number of clusters was still unclear.
5.3.7: SILHOUETTE SCORE APPROACH
The silhouette score is a clustering evaluation metric used to calculate how well a
clustering technique has grouped datapoints [99]. Its values range from -1 to 1. A value of
1 indicates that clusters are well apart from each other, and clearly distinguished. A
value of 0 means that clusters are indifferent to each other (i.e., they overlap or touch), while a score of -1 indicates
that datapoints have been assigned to the wrong clusters.
The silhouette score for a datapoint ‘i’ is calculated from the following equation:
$$s(i) = \frac{b(i) - a(i)}{\max\{a(i),\, b(i)\}}$$ Equ. 8 [99]
Where $b(i)$ is the inter cluster distance, and $a(i)$ is the intra cluster distance. The inter
cluster distance is the average distance from datapoint ‘i’ to the datapoints of the closest cluster that it
is not part of. Mathematically this is given in Equation 9.
$$b(i) = \frac{1}{|C_k|} \sum_{j \in C_k} d(i, j)$$ Equ. 9 [99]
Where $C_k$ is the closest cluster to ‘i’ and $d(i, j)$ is the distance between datapoints i and j. The intra cluster distance is defined as the
average distance between datapoint i and other datapoints within the cluster it is a part
of, denoted $C_i$. Mathematically this is given in Equation 10.
$$a(i) = \frac{1}{|C_i| - 1} \sum_{j \in C_i,\, j \neq i} d(i, j)$$ Equ. 10
The overall silhouette score for a dataset can be calculated as the average silhouette
score for all datapoints within the dataset. Graphically, the intra and inter cluster values
are shown in Figure 33.
Figure 33: Depiction of values used in silhouette score calculation [99].
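The silhouette calculation described above can be sketched in plain Python as follows. This is a minimal illustration that assumes Euclidean distance and at least two clusters. The function name and the four one-dimensional example points are hypothetical, and the treatment of single-member clusters (score of 0) is a common convention rather than something stated in this thesis.

```python
import math


def silhouette_score(points, labels):
    """Mean of s(i) = (b(i) - a(i)) / max(a(i), b(i)) over all datapoints."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # convention: a single-member cluster scores 0
            continue
        # a(i): mean distance to the other members of the point's own cluster.
        # The distance from the point to itself is 0, so it adds nothing to the sum.
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        # b(i): mean distance to the members of the closest other cluster.
        b = min(
            sum(math.dist(point, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denominator = max(a, b)
        scores.append(0.0 if denominator == 0 else (b - a) / denominator)
    return sum(scores) / len(scores)


# Hypothetical data: two well separated groups on a line.
points = [(0.0,), (1.0,), (10.0,), (11.0,)]
good = silhouette_score(points, [0, 0, 1, 1])  # groups match the labels
bad = silhouette_score(points, [0, 1, 0, 1])   # labels cut across the groups
print(round(good, 3), round(bad, 3))
```

The labelling that matches the two groups gives a score close to 1, while the labelling that cuts across them gives a negative score, in line with the interpretation of the score range given above.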
5.3.8: SILHOUETTE SCORE RESULTS
The fourth and final cluster number determination method implemented was the use
of silhouette scores. As described in section 5.3.7, silhouette scores range from -1 to
1, with values closer to 1 indicating better separated clusters. In this calculation, GMM models
with full covariance were fitted with cluster values ranging from 2 to 9 and average
silhouette scores at each cluster number were obtained. This resulted in 9 line plots owing
to the three gas flows and three data sources. Figure 34 shows one graph for each of the
three data sources, with the three gas flows plotted in each graph.
Figure 34: Silhouette scores for GMM plot with full covariance per cluster number for
each data source. Each Data source containing three gas flow rates. Note that in (c)
although the silhouette scores for a cluster number of 2 and 3 appear identical in the
High and Medium gas flow, the value at a cluster number of 2 is marginally larger.
Figure 34 shows the silhouette score for GMM models with full covariance at each
cluster number. A cluster number of 2 achieves the highest silhouette score for each
of the three gas flow rates investigated. This was the first definitive indication of
a cluster number from the four approaches investigated in this study. Note that
for the medium and high gas flow optical microscope datasets, although the cluster
numbers of 2 and 3 appear to have an identical
silhouette score, a cluster number of 2 has a marginally larger silhouette score than 3
in both cases. Therefore, for GMM analysis of the datasets, a cluster number of 2 was
used. It was interesting to note that the same optimal cluster number was obtained
for all 9 individual line plots.
5.4: CLUSTER NUMBER RESULTS SUMMARY
In this study the performance of four different approaches was investigated in order
to determine the cluster number in each dataset. Initially, hierarchical clustering was
implemented; however, no clear cluster number was evident from the produced
dendrogram for any of the data sources. Following on from this, an elbow plot for each
of the data sources was produced. As with hierarchical clustering, no
clear cluster number was evident. Subsequently BIC scores were calculated at each
cluster number for each gas flow in each of the micrograph sources and the XCT. The
lowest BIC score indicates the most appropriate cluster number; however, in each
data source the BIC scores never reached a minimum within the range of cluster numbers evaluated, perhaps
indicating that the GMM struggled to accurately assign cluster membership to certain
datapoints. The fourth method evaluated involved calculating silhouette scores for
each of the data sources. It was concluded from the fourth study that a cluster number
of 2 was the most appropriate for use with each data source.
5.5: UNSUPERVISED CLUSTERING USING GMM RESULTS
The results of GMM clustering with full covariance applied to the XCT and both
micrograph datasets are presented in the following section. Initially the XCT results
are presented, followed by the micrograph results obtained at a magnification of 275X,
and finally the micrograph results obtained at 624X. Porosity sphericity data is
presented with the XCT data, while circularity is presented in conjunction with the
porosity obtained using optical microscopy examination. The data is provided based
on the 2 clusters present within each of the data sources, as determined in the
previous sections.
5.5.1: GMM PLOT FOR XCT DATA
Figure 35: GMM Results for XCT Analysis data from (a) Low (b) Medium (c) High
Gas Flows. Total of 95489 Pores
In Figures 35(a), (b), and (c), the GMM results obtained from an analysis of the XCT
data are presented. The average feature values of both clusters
at each gas flow are presented in table 5.
For each of the three gas flow rates investigated, clusters were initially labelled 0 and
1. Through examination of the average features for each cluster, presented in table 5
it was decided that cluster 0 in each gas flow would be labelled as ‘Regular’ type
porosity, with cluster 1 being labelled ‘Irregular porosity’. The Irregular cluster
encompasses a substantial amount of both Keyhole and Lack of Fusion porosity. The
Regular pores have diameters of less than approx. 0.3 mm, and these have a large
contribution from gas type porosity. This labelling of clusters into Regular or Irregular
type porosity, enabled a comparison of pore type morphology across gas flow rates.
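The labelling step described above was carried out by inspecting the cluster averages. As a minimal sketch of how such a step could be automated, the Python example below labels the cluster with the higher mean sphericity as Regular and then computes the proportion of each pore type. The rule based on sphericity alone, the function names and the five example pores are assumptions made for illustration and do not reproduce the procedure used in this thesis.

```python
from collections import Counter


def label_clusters(cluster_ids, sphericity):
    """Map raw GMM cluster ids to 'Regular' or 'Irregular' by mean sphericity."""
    totals, counts = Counter(), Counter()
    for cluster_id, value in zip(cluster_ids, sphericity):
        totals[cluster_id] += value
        counts[cluster_id] += 1
    means = {cluster_id: totals[cluster_id] / counts[cluster_id] for cluster_id in counts}
    regular_id = max(means, key=means.get)  # assumed rule: more spherical = Regular
    return {
        cluster_id: "Regular" if cluster_id == regular_id else "Irregular"
        for cluster_id in means
    }


def pore_type_proportions(cluster_ids, names):
    """Percentage of pores assigned to each named cluster."""
    counts = Counter(names[cluster_id] for cluster_id in cluster_ids)
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


# Hypothetical GMM output for five pores.
cluster_ids = [0, 0, 0, 1, 0]
sphericity = [0.55, 0.50, 0.52, 0.33, 0.56]
names = label_clusters(cluster_ids, sphericity)
proportions = pore_type_proportions(cluster_ids, names)
print(names, proportions)
```

A rule of this kind only keeps the labels consistent across gas flow rates if the same feature separates the clusters in every build, which would need to be checked against the cluster averages.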
| Gas Flow | Cluster | Diameter [mm] | Voxel | Surface [mm²] | Sphericity | Compactness | Projected Size z [mm] | Aspect Ratio |
|---|---|---|---|---|---|---|---|---|
| Low | Regular | 0.18 | 155 | 0.04 | 0.52 | 0.11 | 0.11 | 1.07 |
| Low | Irregular | 0.45 | 1377 | 0.34 | 0.31 | 0.06 | 0.27 | 1.08 |
| Medium | Regular | 0.18 | 149 | 0.04 | 0.53 | 0.11 | 0.11 | 1.06 |
| Medium | Irregular | 0.39 | 980 | 0.24 | 0.33 | 0.07 | 0.22 | 1.08 |
| High | Regular | 0.17 | 139 | 0.04 | 0.56 | 0.13 | 0.10 | 1.05 |
| High | Irregular | 0.38 | 960 | 0.23 | 0.34 | 0.07 | 0.19 | 1.05 |

Table 5: Cluster Average Feature Values across each Gas Flow for XCT data, for the
Regular and Irregular clusters
| Gas Flow Rate | Regular Porosity Proportion | Irregular Porosity Proportion |
|---|---|---|
| Low (26 per hour) | 79.6% | 20.4% |
| Medium (31 per hour) | 85.7% | 14.3% |
| High (36 per hour) | 90.6% | 9.4% |

Table 6: Pore Type proportion at each gas flow obtained from XCT data
The proportion of each pore morphology type (Regular or Irregular) is presented in
Table 6. This demonstrates that the proportion of Regular type porosity increases
with increasing gas flow rate, with a corresponding decrease in the proportion of Irregular porosity.
This would indicate an increase in gas porosity with an increased
purging gas flow rate in the LPBF process, which may be due to increased gas entrapment
in the molten powder bed during the solidification process. This, in combination with
the increased removal of process by-products, which are associated with the
generation of lack of fusion porosity, results in the decreased Irregular
proportion at higher gas flow rates.
It can also be seen through comparing Figures 35(a), (b) and (c), that the Irregular pores
in the low gas flow build are much larger than in the medium and high gas flow builds,
as well as having a less spherical morphology. A possible explanation is that this
is associated with a considerable increase in lack of fusion porosity in the low gas flow
build, as this form of porosity tends to be the largest and most irregular. The average
Irregular pore diameter and surface area in the low gas flow build are 0.449 mm and
0.338 mm² respectively, in comparison to those obtained at the high gas flow rate of
0.376 mm and 0.232 mm² respectively.
5.5.2: GMM PLOT BASED ON MICROSCOPY DATA
5.5.2.1: OPTICAL MICROSCOPY IMAGES (275X)
Figure 36: GMM Results for Optical Microscopy Analysis data from (a) Low (b) Medium
(c) High Gas Flows at Magnification of 275X. Total of 134987 Pores
Figure 36 shows the GMM clustering results for the Optical Microscopy ImageJ
porosity analysis data, obtained at lower magnification (275X). As in section
5.5.1, two clusters were used as an input to the GMM model, as determined using the
silhouette score analysis. The average feature values for both clusters at each gas
flow are presented in table 7. At each gas flow rate, one cluster (the group labelled as
0 by the GMM model) contained pores that were far smaller and much more regular
than the other cluster (the cluster labelled as 1 by the GMM model). Hence, it was
decided that cluster 0 would be referred to as the ‘Regular’ cluster group, and the other
group be referred to as the ‘Irregular’ cluster group.
| Gas Flow | Cluster | Area | Circularity | Aspect Ratio | Roundness | Solidity | Label |
|---|---|---|---|---|---|---|---|
| Low | 0 | 17 | 0.90 | 1.64 | 0.68 | 0.89 | Regular |
| Low | 1 | 2190 | 0.48 | 2.55 | 0.49 | 0.75 | Irregular |
| Medium | 0 | 7 | 0.93 | 1.77 | 0.70 | 0.92 | Regular |
| Medium | 1 | 487 | 0.43 | 2.06 | 0.54 | 0.66 | Irregular |
| High | 0 | 16 | 0.87 | 2.30 | 0.65 | 0.89 | Regular |
| High | 1 | 1103 | 0.51 | 1.91 | 0.60 | 0.89 | Irregular |

Table 7: Cluster Average Feature Values across each Gas Flow obtained from
microscopy images (Magnification = 275X)
Based upon the class labels decided upon through the values in table 7, the pore type
proportions of both Regular and Irregular clusters in each gas flow are presented in
table 8.
| Gas Flow Rate | Regular Porosity Proportion | Irregular Porosity Proportion |
|---|---|---|
| Low (26 per hour) | 74.3% | 25.7% |
| Medium (31 per hour) | 95.4% | 4.6% |
| High (36 per hour) | 93.4% | 6.6% |

Table 8: Pore Type proportion at each gas flow based on Optical Microscopy data
(M=275X). Total of 55 Samples
As in table 6, the low gas flow exhibits by far the lowest proportion of Regular porosity.
However, in contrast to table 6 the medium gas flow exhibits the largest proportion of
Regular porosity rather than the high gas flow, with a difference of 2% between the
Regular pore type proportions of these two builds. This differs from the behaviour observed in the
GMM clustering of the XCT data. It is interesting to note however that although
the Regular pore proportion in the low gas flow is comparable to the XCT low gas flow,
the Regular pore proportion in the Optical Microscopy medium and high builds was
10% and 3% higher respectively than that obtained for samples analysed using XCT at the
corresponding gas flows.
5.5.2.2: OPTICAL MICROSCOPY MAGNIFICATION 624X
Figure 37: GMM Results for Optical Microscopy Analysis data from (a) Low (b) Medium
(c) High Gas Flows at Magnification of 624X. Total of 249,826 pores
Figure 37 shows the GMM plots for the increased magnification Optical Microscopy
data at each of the gas flow rates. Important to note in these graphs is the x axis, which
is an order of magnitude larger in the low gas flow compared to the high gas flow. The
only gas flow in this case to exhibit considerable amounts of Irregular porosity was the
low gas flow, with the medium and high gas flows only detecting 3 and 1 Irregular
pores respectively.
A consequence of the medium and high Irregular cluster being so small and skewed
towards large porosity is that the average area of these pores was considerably larger
than the Irregular cluster in the low gas flow rate. In both previous clustering
applications, the Irregular class in the low gas flow contained the largest porosity (by
average area) and this should be the case in this situation also due to the large amount
of extremely large-scale porosity in the low gas flow.
At each gas flow, two clusters were present. At each gas flow, the cluster group ‘0’
contained pores that were far smaller and much more regular in shape than the other cluster.
Consequently this cluster was referred to as the ‘Regular’ cluster in each gas flow,
while the other cluster was referred to as the ‘Irregular’ cluster.
The average feature values for the clusters in Figure 37 are given in table 9. The
average area of Irregular pores in the low gas flow is considerably reduced due to the
large amount of small-scale porosity relative to the other 2 gas flows, as evident in
Figure 37.
| Gas Flow | Cluster | Area | Circularity | Aspect Ratio | Roundness | Solidity | Label |
|---|---|---|---|---|---|---|---|
| Low | 0 | 1.3 | 0.954 | 1.585 | 0.718 | 0.938 | Regular |
| Low | 1 | 275.0 | 0.454 | 5.549 | 0.312 | 0.737 | Irregular |
| Medium | 0 | 2.7 | 0.914 | 1.793 | 0.675 | 0.904 | Regular |
| Medium | 1 | 18154.7 | 0.178 | 2.237 | 0.531 | 0.612 | Irregular |
| High | 0 | 2.1 | 0.906 | 1.960 | 0.706 | 0.908 | Regular |
| High | 1 | 7841.9 | 0.348 | 1.215 | 0.823 | 0.874 | Irregular |

Table 9: Cluster Average Feature Values across each Gas Flow for Optical
Microscopy (Magnification = 624X)
In table 9, at each gas flow the cluster labelled by the GMM model as cluster 0
contained small and regular pores, evident through their reduced average area and
increased average circularity. It was consequently decided that these clusters would
be referred to as the ‘Regular’ cluster, and cluster 1 would be referred to as the
‘Irregular’ cluster.
Based upon the class labels decided upon through the values in table 9, the pore type
proportions at each gas flow are presented in table 10. There is again an increase in the
proportion of Regular pores across all builds relative to the 275X data, with both the medium and high builds
containing 99%+ Regular type porosity, and the low gas flow Regular pore proportion
increasing by 14%.
| Gas Flow Rate | Regular Porosity Proportion | Irregular Porosity Proportion |
|---|---|---|
| Low (26 per hour) | 89.080% | 10.920% |
| Medium (31 per hour) | 99.997% | 0.003% |
| High (36 per hour) | 99.998% | 0.002% |

Table 10: Pore Type proportion at each gas flow in micrograph data (M=624X). Total
of 9 Samples
5.6: DISCUSSIONS AND CONCLUSIONS
The focus of this study was the evaluation of unsupervised machine learning to help
cluster the type of porosity present in the Ti-6Al-4V alloy parts. The GMM analysis
was carried out on both the XCT data, as well as on ImageJ data obtained from
cross sectional microscopy images of the alloy parts both at a similar magnification
as used for the XCT examination, as well as at a higher magnification. Porosity
analysis through XCT is limited by the resolution of this analysis technique. A further
limitation is the additional reduction in resolution which results from the application
of the VGStudio Max PIA module to the XCT data; this limited the resolution
to approx. 80 µm. In contrast, significantly higher levels of porosity were identified
based on the use of the microscopy analysis data.
The unsupervised GMM analysis of XCT data demonstrated that Regular pores are
the most common pore type for all three gas flow rate builds. The parts printed at
the higher gas flow rate, in addition to having the lowest porosity, also had the largest
proportion of Regular pores at 91%. In contrast, parts built under a low gas flow rate
had a higher proportion of Irregular porosity, with the proportion of Regular pores at
80%. The pore sizes and irregular shapes observed in the microscopy
examination indicate that the Irregular pores at low gas flow were largely
associated with LOF porosity, while those at high gas flow were largely associated with
Key-Hole porosity.
Using the method of silhouette scores to search for the optimal number of natural
clusters present in the dataset, it was concluded that the optimal number of clusters
was 2 for all three gas flow rate builds. The reason for this optimal number of 2 rather
than 3 clusters for the three builds investigated may be that significant levels of both
LOF and Key-Hole pores did not simultaneously occur in parts from the same gas
flow rate build.
In summary, XCT has many advantages as a method for porosity analysis over that
of micrograph analysis as it is non-destructive, less time consuming, and gives
information on the entire component rather than an individual cross section. The
VGStudio data can be successfully inputted into the GMM unsupervised machine
learning technique, to quantify the type of porosity present. A difficulty however is
the instrument resolution, which is further limited in order to avoid system data
‘noise’ when using the VGStudio Max PIA module. Based on a comparison between
the micrograph and XCT analysis using GMM, this unsupervised machine learning
approach has been successfully applied as a method for the clustering of the
porosity distribution within alloy parts with varying types of porosity.
CHAPTER 6: SUPERVISED CLASSIFICATION AND EVALUATION
6.1: INTRODUCTION
This chapter details the results of an investigation of the use of supervised
classification algorithms, as a machine learning approach to differentiate between
Gas, Keyhole and Lack of Fusion pores. The investigation was carried out using the
same ImageJ porosity analysis data obtained from the micrographs datasets, which
were used to evaluate the unsupervised machine learning approach detailed in
Chapter 5. The eight algorithms studied were: K-Nearest Neighbour (KNN); Decision
Tree Classifiers (DTC); Naïve Bayes (NB); Support Vector Machine (SVM); Logistic
Regression (LR); Multi-Layer Perceptron (MLP); Extreme Gradient Boosted
Classification (XGB) and Gradient Boosted Classification (GB). These algorithms were
selected due to their previous application in the field of Additive Manufacturing, as
detailed in Chapter 1 (Table 1). All algorithms are implemented using their default
Scikit Learn parameters, in this study.
This chapter will initially discuss the datasets used in this study, followed by how
supervised classification machine learning is implemented. Then the four metrics used
to evaluate the algorithms' performance are discussed. The results of the classification
study will then be presented, followed by a discussion and conclusions.
6.2: SUPERVISED CLASSIFICATION
6.2.1: DATASET DESCRIPTION
In this study, two separate datasets were used. The first was obtained from the low
(275X) magnification ImageJ porosity analysis dataset obtained from the sectioned
Ti-6Al-4V samples. This was also used in the unsupervised clustering study described in
Chapter 5. In total 55 samples from this sample set were examined, based on cross
sections of the printed alloy samples. This dataset will be referred to as the ‘Gas Flow’
dataset throughout this chapter. The XCT dataset was not evaluated in this study, as
it was considerably more difficult to develop a training dataset from the 3D images. A
second dataset referred to as the ‘Test Set’ throughout this chapter, was obtained from
a separate Ti-6Al-4V alloy printing study, which was carried out using the Renishaw
RenAM 500M system, as detailed in Chapter 3. The same sectioning methodologies
were applied to the cylinders printed for this study, as for the Gas Flow dataset, with
cross sections obtained for microscopy analysis (in this case for 16 samples). As
before an ImageJ porosity analysis dataset was obtained.
Each of the ImageJ porosity datasets contained five features, namely: Area;
Circularity; Roundness; Solidity; and Aspect Ratio. A description of these features is
available in chapter 3 ‘Optical Microscopy Examination’. A manually labelled column
called ‘Pore Type’ is also included. The Gas Flow dataset contains 1082 pores, while
the smaller ‘Test Set’ dataset contains 482 pores. Example cross section images from
each sample set are shown in Figure 38.
Figure 38: Example cross section images obtained from the two sample sets referred
to in this chapter with the Gas Flow sample shown on the left and that obtained from
the Test Set sample set, shown on the right
The manual labelling of the ‘Gas Flow’ alloy cross sections was performed through a
visual examination of the microscopy images. Lack of fusion and keyhole pores were
readily identified due to their large cross-sectional area, while the distinguishing
characteristic between these morphologies was the more irregular cross sections
evident in the lack of fusion class. Gas type porosity was identifiable as smaller pores,
which were generally found to occur in clusters within the alloy, in contrast to the
other two pore types.
6.2.2: CLASSIFICATION IMPLEMENTATION
For model evaluation in supervised learning, the model must be evaluated on data
that it has not yet seen, to avoid what is referred to as ‘peeking’ [100]. This is to avoid
the model being exposed to the value or category it is about to predict, while it is being
fit, as this can lead to overfitting.
In supervised classification analysis, two different methods can be used to evaluate
model performance. The first method is known as ‘K-Fold Cross Validation’.
This method involves splitting the data into ‘K’ distinct sections or folds and fitting the
chosen model on K-1 folds [100]. The model is then tested on the Kth fold, and the
number of incorrectly classified samples is recorded. This is then repeated K times,
using each fold as a test set once. Schematically this is shown in Figure 39, for K=5.
The error is then averaged over the K folds. The test set can be seen moving through
the entire dataset from left to right in this Figure. The advantage of this method is that
instead of using only a single portion of the data as a test set for algorithm performance,
the entire dataset acts as a test set.
The second method is simply training the models upon one dataset, and then
evaluating them upon a second separate dataset. Both methods are used in this
thesis. 5-Fold Cross Validation is performed upon the Gas Flow Set, and then the
models will be trained upon the Gas Flow dataset and evaluated on the Test set. This
will be implemented twice, once for Gas/Non-Gas pores, and secondly for
Keyhole/Lack of Fusion pores.
Figure 39: 5-Fold Cross Validation Example showing how the 20% of the total data
(indicated in grey) acting as a test set, moves through each portion of the dataset [100]
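The fold construction shown in Figure 39 can be sketched in plain Python as follows. This is a minimal illustration only: the function name is hypothetical, the folds are taken in order without shuffling, and it is not the Scikit Learn routine used to produce the results in this chapter.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


# Example with the size of the Gas Flow dataset (1082 pores) and K = 5.
folds = list(k_fold_indices(1082, k=5))
for train, test in folds:
    print(len(train), len(test), test[0], test[-1])
```

Each pore appears in exactly one test fold, so the error averaged over the five folds is based on every pore in the dataset. In practice the pores would normally be shuffled before splitting, so that each fold contains a mixture of pore types.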
6.3: CLASSIFICATION EVALUATION
For supervised classification models, model evaluation is relatively straightforward
compared with other machine learning techniques such as unsupervised clustering
models, due to the input vectors in supervised classification models having a label.
This label is known as the ‘Ground Truth’ label, and allows for model predictions to
be classified as correct or incorrect.
In this study, rather than a multiclass classification, two binary classifications are
made. The two classifications used for this study are Gas/Non-Gas, and Keyhole/Lack
of Fusion. This implies that each of the Gas Flow and Test sets will be tested twice.
Firstly, with each pore being labelled as Gas/Non-Gas (Keyhole and Lack of Fusion),
and then secondly this Non-Gas subset shall be evaluated for Keyhole/Lack of Fusion
pores. This shall result in 4 sets of results, two 5-fold cross validation results on the
Gas Flow Set, and two further evaluations on the Test Set.
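The two-stage labelling described above can be illustrated with the short Python sketch below. The label strings ('Gas', 'Keyhole', 'Lack of Fusion'), the function names and the example values are assumptions made for illustration, as the exact contents of the ‘Pore Type’ column are not listed here.

```python
def gas_vs_non_gas(pore_types):
    """First binary problem: collapse Keyhole and Lack of Fusion into 'Non-Gas'."""
    return ["Gas" if pore_type == "Gas" else "Non-Gas" for pore_type in pore_types]


def non_gas_subset(features, pore_types):
    """Second binary problem: keep only the Keyhole and Lack of Fusion pores."""
    kept = [(row, t) for row, t in zip(features, pore_types) if t != "Gas"]
    return [row for row, _ in kept], [t for _, t in kept]


# Hypothetical pores, each described by (area, circularity).
features = [(12.0, 0.93), (950.0, 0.71), (2100.0, 0.42), (9.0, 0.95)]
pore_types = ["Gas", "Keyhole", "Lack of Fusion", "Gas"]

stage_one_labels = gas_vs_non_gas(pore_types)
stage_two_features, stage_two_labels = non_gas_subset(features, pore_types)
print(stage_one_labels)
print(stage_two_features, stage_two_labels)
```

Applying both stages to the Gas Flow dataset and to the Test Set gives the four sets of results referred to above.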
Predictions made by a classification model in the case of binary classification can
be encapsulated in a ‘Confusion Matrix’. An example of a Confusion Matrix (CM) is
given in Figure 40, where TP, FP, FN and TN correspond to True Positive, False
Positive, False Negative, and True Negative respectively [101]. This CM can
encapsulate every single prediction made by the classifier, by labelling one class as
true, and one class as false. For example, if Lack of Fusion pores are labelled as true,
and Keyhole pores are labelled as false, then every correct prediction made by the
classifier shall exist in the green TP or TN sections of Figure 40. However incorrect
predictions shall exist in the red sections at FP or FN.
Figure 40: Example Confusion Matrix for a binary classification problem [101]
Different measures can be obtained from the confusion matrix, such as accuracy, false
positive rate and false negative rate [101]. These measures can be used in order to
calculate classification metrics such as the Accuracy; the F1-Score; the Recall (for a
specified class); and the Area Under Curve (AUC) Score. The following section
provides some background as to how each of these measures are calculated.
6.3.1: GENERAL ACCURACY
General accuracy is defined as “the number of correct predictions divided by the total
number of predictions”. It provides a value indicative of model performance in
successfully classifying samples (in this case pores). Mathematically this can be
written as shown in Equation 11 [101].
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$ Equ. 11
Where TP, TN, FP, and FN correspond to the confusion matrix labels in Figure 40.
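As a minimal sketch of how the confusion matrix entries and Equation 11 fit together, the Python example below counts TP, FP, FN and TN for a binary problem and computes the general accuracy. The function names, the 'LOF' and 'Keyhole' label strings and the five example predictions are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive):
    """Return (TP, FP, FN, TN) for a binary problem with the given positive class."""
    tp = fp = fn = tn = 0
    for truth, prediction in zip(y_true, y_pred):
        if prediction == positive:
            if truth == positive:
                tp += 1  # predicted positive, actually positive
            else:
                fp += 1  # predicted positive, actually negative
        else:
            if truth == positive:
                fn += 1  # predicted negative, actually positive
            else:
                tn += 1  # predicted negative, actually negative
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Equation 11: correct predictions divided by all predictions."""
    return (tp + tn) / (tp + fp + fn + tn)


# Hypothetical ground truth labels and predictions, with Lack of Fusion as positive.
y_true = ["LOF", "LOF", "Keyhole", "Keyhole", "LOF"]
y_pred = ["LOF", "Keyhole", "Keyhole", "LOF", "LOF"]
counts = confusion_counts(y_true, y_pred, positive="LOF")
print(counts, accuracy(*counts))
```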
6.3.2: AREA UNDER CURVE (AUC) SCORE AND RECEIVER OPERATOR
CHARACTERISTIC (ROC) CURVE
In supervised classification, each model has what is called a probability threshold
[101]. When training the algorithm is completed, and a new sample is classified, it is
assigned a probability of belonging to the ‘positive’ or ‘truth’ class. If this probability
exceeds the probability threshold, it is labelled as a ‘positive’ class.
The total numbers of true positives, false positives, false negatives, and true negatives
are recorded for a classification model's performance upon a test set. These measures
are used in order to compute a True Positive Rate (TPR) and a False Positive Rate
(FPR).
TPR is defined as the proportion of actual positive samples that are correctly identified
as positive, or mathematically by equation 12 [101]:
$$\mathrm{TPR} = \frac{TP}{TP + FN}$$ Equ. 12
FPR is defined as the proportion of actual negative samples that are incorrectly identified as
positive, or mathematically by equation 13 [101]:
$$\mathrm{FPR} = \frac{FP}{FP + TN}$$ Equ. 13
To combine the two TPR and FPR measures into a single metric, the TPR and FPR
are plotted for a series of different probability thresholds within the model. The
resulting graph is then known as the ROC curve, and both FPR and TPR must be
within the range 0 to 1 (inclusive of both 0 and 1).
The Area underneath the ROC curve is known as the AUC score for that model [101].
An ideal classifier has a TPR of 1, and an FPR of 0, regardless of probability thresholds
within the model. This results in an ideal AUC score of 1. Hence, the higher the AUC
score, the better the performance of the model at distinguishing between the classes.
Some example ROC curves are plotted in Figure 41, in comparison to what is
considered as a ‘no skill’ classifier (similar to that of flipping a coin, which
corresponds to an AUC score of 0.5). Models whose ROC curve approaches a TPR of 1 and an
FPR of 0 are deemed better classifiers. A high AUC score indicates a ‘robust’ model,
in that a randomly chosen positive sample is likely to be assigned a higher probability
than a randomly chosen negative sample, regardless of the probability threshold within the model.
Figure 41: ROC Curves for 3 example models compared with flipping a coin. Better
models are shown approaching the behaviour of a perfect classifier [101]
Probability threshold refers to the decision threshold value for converting a class
probability into a class label in classification problems [101]. In this study, the class
probability is the probability that a pore belongs to the positive class (e.g., Gas
porosity), and a pore is assigned to that class when this probability exceeds the
decision threshold value. In most cases, the threshold is
0.5 by default; however, in order to produce an ROC curve, this value
is varied from 0.01 to 0.99 to produce a series of TPR and FPR values, which
when plotted produce an ROC curve as can be seen in Figure 41. The AUC
measure shall be used in this study to give an indication of model performance, i.e. to
indicate how good the model is at distinguishing between classes. Specifically, it indicates how
well the algorithms distinguish between pore types in this study.
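The threshold sweep described above can be sketched in plain Python as follows. This is a minimal illustration and not the Scikit Learn implementation used in this study: the function names, the six example probabilities and the use of the trapezoidal rule for the area are assumptions, and both classes must be present in the labels.

```python
def roc_points(y_true, scores, thresholds):
    """Return sorted (FPR, TPR) pairs; y_true holds 1 (positive) or 0 (negative)."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = {(0.0, 0.0), (1.0, 1.0)}  # end points shared by every ROC curve
    for threshold in thresholds:
        # A sample is labelled positive when its probability exceeds the threshold.
        tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s > threshold)
        fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s > threshold)
        points.add((fp / negatives, tp / positives))
    return sorted(points)


def auc(points):
    """Area under the ROC curve using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical predicted probabilities of belonging to the positive class.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
thresholds = [i / 100 for i in range(1, 100)]  # 0.01 to 0.99
curve = roc_points(y_true, scores, thresholds)
print(curve, round(auc(curve), 3))
```

In this hypothetical example one positive sample (0.4) is scored below one negative sample (0.6), so eight of the nine positive and negative pairs are ranked correctly and the area evaluates to 8/9.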
6.3.3: RECALL AND PRECISION SCORES
Recall can be defined as how accurately a model predicts a given class [134]. This
given class is also known as the ‘positive class’. In binary classification, it must be
stated which class was selected as the positive class, with the other class then being
referred to as the negative class. Therefore, the results presented for Recall are
presented for the positive class.
Recall, also known as True Positive Rate, is the proportion of actual positive cases
that were correctly identified as positive [134]. This is defined mathematically in
Equation 12 [101]. Recall scores closer to 1 indicate a model which consistently
identifies a selected class correctly. In this study, initially recall is selected for non-Gas
porosity, as these pores are more damaging to the mechanical performance of LPBF
produced components. In the classification results for Keyhole and Lack of Fusion
pores, Recall is selected for Lack of Fusion porosity, as this form of porosity is more
damaging than Keyhole porosity. Therefore, the Recall results presented in sections
6.4.1 and 6.4.3 are presented for how accurately the models classified Non-Gas
porosity, and the Recall results presented in sections 6.4.2 and 6.4.4 are presented for
how accurately the models classified Lack of Fusion porosity.
Precision is the ratio of true positives to the total of false and true positives, i.e., the
proportion of the cases predicted as positive that are actually
positive [134]. Precision scores range from 0 to 1, and the closer the value is to 1 the
better, as a high Precision score indicates that the model produced very few false
positives.
In this study, as for the case of Recall, precision is selected for the non-Gas class in
sections 6.4.1 and 6.4.3, and for the Lack of Fusion class in sections 6.4.2 and 6.4.4.
Mathematically precision can be written as [134]:
$$\mathrm{Precision} = \frac{TP}{TP + FP}$$ Equ. 14
6.3.4: F1 SCORES
When tuning a given model, for example by changing its probability threshold, there is
typically a trade-off between Recall and Precision: increasing one tends to reduce the
other [134]. Therefore, it
is common to use a combination of the two measures. The F1 score is defined as the
harmonic mean of the two measures; mathematically this can be described as
[134]:
$$F_1 = 2 \cdot \frac{\mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$ Equ. 15
A high F1 score indicates both a high precision, as well as a high recall. A low F1 score
can be due to either a low Precision, or a low Recall. If precision is low, of the cases
predicted as positive, not many of these predictions were correct. If Recall is low, this
indicates many of the actual positives were incorrectly classified as negatives.
F1 scores are a useful metric for assessing model performance. However, while a low
F1 score indicates poor performance, it does not reveal whether the source of the
error is low Precision or low Recall. Hence, when reporting F1 scores it is
important to also keep track of the individual Precision and Recall scores.
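The relationship between the three measures can be illustrated with a minimal sketch. The function name and the counts below are hypothetical and are not taken from the thesis datasets; they simply show how a model with perfect Recall can still have a modest F1 score when its Precision is low.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts (illustrative only).
balanced = precision_recall_f1(tp=80, fp=20, fn=20)

# A model that labels almost everything as positive:
# no positives are missed, but half of its positive calls are wrong.
biased = precision_recall_f1(tp=100, fp=100, fn=0)

print("balanced:", balanced)
print("biased:  ", biased)
```

In the second case Recall is 1.0 while Precision is 0.5, which is why the F1 score should be read alongside its two components.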
6.4: CLASSIFICATION RESULTS
The results section of this chapter is divided into four sections.
The first section will display the results for the 5-Fold cross validation
examination for Gas/Non-Gas classification on the Gas Flow dataset.
The second section will display the results for the 5-Fold cross validation
examination for Keyhole/Lack of Fusion classification on the Gas Flow dataset.
The third section will display the results for Gas/Non-Gas classification on the
Test dataset.
The fourth section will display the results for Keyhole/Lack of Fusion
classification on the Test dataset.
6.4.1: GAS / NON-GAS 5-FOLD CROSS VALIDATION
This study investigated the performance of the eight classification algorithms applied
to the ‘Gas Flow’ dataset through 5-fold cross validation. This was a binary
classification analysis and investigated the algorithm’s ability to classify Gas type
pores and Non-Gas type pores.
This Gas Flow dataset was obtained from a sectioning study of Ti-6Al-4V cylinders
produced at varying Argon gas flow rates. Recall here is selected for Non-Gas
porosity.
Model      Accuracy       AUC            F1 Score       Recall
KNN        0.829±0.006    0.891±0.02     0.859±0.02     0.800±0.03
DTC        0.827±0.007    0.821±0.03     0.860±0.02     0.834±0.03
NB         0.810±0.001    0.879±0.02     0.832±0.02     0.727±0.03
SVM        0.754±0.03     0.886±0.02     0.782±0.05     0.778±0.02
LR         0.823±0.002    0.883±0.02     0.848±0.02     0.759±0.04
MLP        0.820±0.005    0.864±0.02     0.847±0.03     0.759±0.04
XGB        0.839±0.006    0.903±0.02     0.876±0.02     0.850±0.03
GB         0.824±0.007    0.896±0.02     0.860±0.02     0.823±0.04
Average    0.816±0.008    0.878±0.022    0.846±0.024    0.792±0.052
Table 11: Performance metrics for classifier algorithms applied to the Gas Flow Set for
Gas/Non-Gas classification. The average score in each metric is presented for
comparative purposes. The eight algorithms studied were: K-Nearest
Neighbour (KNN); Decision Tree Classifier (DTC); Naïve Bayes (NB); Support Vector
Machine (SVM); Logistic Regression (LR); Multi-Layer Perceptron (MLP); Extreme
Gradient Boosted Classification (XGB) and Gradient Boosted Classification (GB).
Table 11 shows the performance of each classifier on the Gas Flow Set using 5-fold
cross validation for Gas/Non-Gas classification. General accuracy refers to the
proportion of correct predictions, while AUC refers to the classifier’s ability to
distinguish between the two pore types. As detailed in section 6.3.3, Recall measures
the algorithm’s ability to identify Non-Gas porosity, while the F1 score combines
Recall with Precision, and so also reflects how rarely Gas pores are mislabelled as
Non-Gas pores. Each score is reported with an uncertainty (±) that reflects the
variation across the five folds of the cross validation analysis.
Of the eight classifier algorithms investigated, the two Gradient Boosting algorithms
achieved the highest AUC values, followed by the KNN model. The Extreme Gradient
Boosting model also achieved the highest value in each of the four chosen metrics,
indicating a high level of accuracy overall, as well as an ability to
classify Non-Gas porosity in the Gas Flow Set to a high degree of accuracy.
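The way a 5-fold cross validation score and its ± term can be produced is sketched below. This is an illustration only: it assumes the ± values are standard deviations across folds, which the text does not state, and it replaces the eight classifiers with a toy single-threshold model on a synthetic ‘area’ feature. All function names and data are hypothetical.

```python
import random
import statistics


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_threshold(areas, labels):
    """Toy 'model': the midpoint between the mean area of each class."""
    pos = [a for a, y in zip(areas, labels) if y == 1]
    neg = [a for a, y in zip(areas, labels) if y == 0]
    return (statistics.mean(pos) + statistics.mean(neg)) / 2


def cross_validate_accuracy(areas, labels, k=5, seed=0):
    """Train on k-1 folds, score on the held-out fold, repeat for each fold."""
    scores = []
    for held_out in k_fold_indices(len(areas), k, seed):
        held = set(held_out)
        train = [i for i in range(len(areas)) if i not in held]
        threshold = fit_threshold([areas[i] for i in train],
                                  [labels[i] for i in train])
        correct = sum((areas[i] > threshold) == bool(labels[i])
                      for i in held_out)
        scores.append(correct / len(held_out))
    return scores


# Synthetic pore areas in arbitrary units: class 0 = Gas, class 1 = Non-Gas.
rng = random.Random(42)
areas = ([rng.gauss(10, 3) for _ in range(100)]
         + [rng.gauss(40, 10) for _ in range(100)])
labels = [0] * 100 + [1] * 100

fold_scores = cross_validate_accuracy(areas, labels, k=5)
mean_score = statistics.mean(fold_scores)
spread = statistics.stdev(fold_scores)
print(f"Accuracy: {mean_score:.3f}±{spread:.3f}")
```

Each pore is held out exactly once, so every sample contributes to one of the five fold scores that are then averaged.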
6.4.2: KEYHOLE/LACK OF FUSION 5-FOLD CROSS VALIDATION
This study investigated the performance of the eight classification algorithms applied
to the ‘Gas Flow’ dataset, through 5-fold cross validation. This was a binary
classification analysis and investigated the algorithm’s ability to classify Lack of Fusion
type pores and Keyhole type pores. As detailed earlier the Gas Flow dataset was
obtained from a sectioning study of Ti-6Al-4V cylinders produced at varying Argon gas
flow rates. Recall here is selected for Lack of Fusion porosity.
Model      Accuracy       AUC            F1 Score       Recall
KNN        0.632±0.012    0.618±0.037    0.733±0.028    0.767±0.044
DTC        0.698±0.014    0.670±0.039    0.784±0.029    0.788±0.041
NB         0.528±0.002    0.565±0.038    0.532±0.041    0.396±0.042
SVM        0.678±0.000    0.597±0.080    0.807±0.026    1.000±0.000
LR         0.648±0.008    0.702±0.048    0.771±0.032    0.877±0.067
MLP        0.678±0.001    0.503±0.028    0.809±0.024    0.999±0.014
XGB        0.738±0.010    0.790±0.035    0.813±0.024    0.837±0.041
GB         0.759±0.011    0.801±0.031    0.827±0.025    0.864±0.040
Average    0.670±0.007    0.656±0.042    0.759±0.029    0.816±0.036
Table 12: Performance metrics for classifier algorithms applied to Gas Flow set for
Keyhole/Lack of Fusion Classification. Average Score in each metric presented for
comparative purposes.
Table 12 shows the performance metrics of all eight classifiers in the 5-fold cross
validation experiments on the Gas Flow dataset. In contrast to Table 11, the AUC,
F1 Score and Recall are all reported with Lack of Fusion (LOF) porosity as the class
of interest, rather than Keyhole porosity. This approach was taken as LOF porosity is
the most detrimental form of porosity, due to its highly irregular geometry promoting
crack initiation at these pore locations.
As in Table 11, the Extreme Gradient Boosting model performed above average
across all metrics; in this case, however, regular Gradient Boosting outperformed
Extreme Gradient Boosting in each metric. Both the Support Vector Machine and
Multi-Layer Perceptron obtained extremely high Recall values of 1.000 and 0.999
respectively; however, these came at the expense of lower-than-average AUC scores,
indicating a strong bias towards Lack of Fusion porosity in these algorithms.
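The combination of very high Recall with an AUC close to 0.5 can be reproduced in a small idealised sketch. AUC is computed here through its rank interpretation (the probability that a randomly chosen positive is scored above a randomly chosen negative, with ties counted as one half). The labels and scores are hypothetical and are not drawn from the Gas Flow dataset.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def recall(predictions, labels):
    """Fraction of actual positives that were predicted positive."""
    tp = sum(1 for p, y in zip(predictions, labels) if p == 1 and y == 1)
    return tp / sum(labels)


# Hypothetical labels: 1 = Lack of Fusion, 0 = Keyhole.
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# A model that assigns every pore the same high Lack of Fusion score.
constant_scores = [0.9] * len(labels)
constant_preds = [1 if s >= 0.5 else 0 for s in constant_scores]

# A model whose scores rank most Lack of Fusion pores above Keyhole pores.
ranked_scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
ranked_preds = [1 if s >= 0.5 else 0 for s in ranked_scores]

print("constant:", recall(constant_preds, labels), roc_auc(constant_scores, labels))
print("ranked:  ", recall(ranked_preds, labels), roc_auc(ranked_scores, labels))
```

The constant model reaches a Recall of 1.0 with an AUC of 0.5, whereas the ranking model has a lower Recall (0.75) but a much higher AUC (0.9375), which is the pattern that distinguishes a biased classifier from a discriminating one.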
6.4.3: GAS / NON-GAS TEST SET EVALUATION
This study investigated the performance of the eight classification algorithms on the
‘Test Set’ dataset, having been trained upon the ‘Gas Flow’ dataset. This binary
classification analysis investigated the algorithm’s ability to classify ‘Gas’ type pores,
and ‘Non-Gas’ type pores. Recall here is selected for Non-Gas porosity.
As detailed earlier, this ‘Test Set’ was obtained from a highly porous sample set of
Ti-6Al-4V cylinders, which were sectioned and examined using the same approach used
to obtain the ‘Gas Flow’ dataset.
Model      Accuracy    AUC      F1 Score    Recall
KNN        0.793       0.894    0.838       0.844
DTC        0.793       0.760    0.849       0.986
NB         0.778       0.881    0.884       0.834
SVM        0.778       0.859    0.864       0.771
LR         0.861       0.874    0.861       0.840
MLP        0.861       0.863    0.848       0.844
XGB        0.846       0.898    0.835       0.955
GB         0.846       0.909    0.847       0.934
Average    0.819       0.867    0.853       0.876
Table 13: Performance metrics for classifier algorithms applied to Test Set for
Gas/Non-Gas Classification. Average Score in each metric presented for
comparative purposes.
The performance of each classifier on the Test Set data for Gas/Non-Gas
classification is presented in Table 13. It can be seen from this table that for each of
the four metrics evaluated, a different classifier obtained the highest performance,
which contrasts with the Gas Flow Set, where the Extreme Gradient Boosted classifier
scored the highest across all metrics. However, Extreme Gradient Boosting still
recorded above average scores in Accuracy, AUC and Recall, but not for the F1 score,
indicating that the model tends to produce some false positives, in which Gas
porosity is classified as Non-Gas porosity.
Figure 42 depicts the Receiver Operating Characteristic (ROC) curves for the test set
performance. The Decision Tree Classifier’s (DTC) AUC score is visibly impacted by
its below average Sensitivity (True Positive Rate) in the interval [0.0 to 0.4].
Figure 42: Gas / Non-Gas Pore classification ROC Curve Test set. A higher ROC
curve indicates a larger AUC score and consequently a more accurate classification.
6.4.4: KEYHOLE/LACK OF FUSION TEST SET EVALUATION
This study investigated the performance of the eight classification algorithms on the
‘Test Set’ dataset, having been trained upon the ‘Gas Flow’ dataset. This binary
classification analysis investigated the algorithm’s ability to classify ‘Lack of Fusion’
type pores, and ‘Keyhole’ type pores. Recall here is selected for the Lack of Fusion
porosity.
The Test Set was obtained from a highly porous sample set of Ti-6Al-4V cylinders,
obtained through the same sectioning methodologies as the ‘Gas Flow’ dataset.
Model      Accuracy    AUC     F1 Score    Recall
KNN        0.641       0.76    0.713       0.890
DTC        0.641       0.66    0.685       0.773
NB         0.644       0.74    0.701       0.632
SVM        0.644       0.46    0.667       1.000
LR         0.730       0.73    0.651       0.914
MLP        0.730       0.51    0.667       1.000
XGB        0.718       0.78    0.758       0.883
GB         0.702       0.79    0.741       0.853
Average    0.681       0.72    0.698       0.868
Table 14: Performance metrics for classifier algorithms applied to test set for
Keyhole/Lack of Fusion Classification. Average Score in each metric presented for
comparative purposes.
Due to the smaller size of this dataset, the LOF class was up-sampled from 125
datapoints to 163. The Test Set dataset contained fewer pores than the Gas Flow
dataset, as fewer samples were available in this sample set, and consequently
fewer cross sections. Up-sampling is a process whereby random samples from a
smaller dataset are duplicated to increase the size of the dataset [135].
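Up-sampling by random duplication can be sketched as follows. The function name, the fixed seed and the placeholder pore records are assumptions made for illustration; only the sizes (125 and 163) are taken from the text, and sampling with replacement is assumed.

```python
import random


def upsample(minority, target_size, seed=0):
    """Grow `minority` to `target_size` by duplicating randomly chosen rows."""
    if target_size < len(minority):
        raise ValueError("target_size must be at least the current size")
    rng = random.Random(seed)
    # Draw the extra rows with replacement from the original rows only.
    extra = rng.choices(minority, k=target_size - len(minority))
    return list(minority) + extra


# Placeholder records standing in for the 125 Lack of Fusion pores.
lof_pores = [{"pore_id": i} for i in range(125)]
lof_upsampled = upsample(lof_pores, target_size=163)
print(len(lof_pores), "->", len(lof_upsampled))
```

Because the added rows are exact copies, up-sampling changes the class balance without adding any new information about the pores.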
Table 14 shows the performance of each classifier on the Test Set for Keyhole/Lack
of Fusion classification, as well as an average score for comparative purposes. In each
metric, a different classifier achieved the highest performance. It is interesting to note
that the two classifiers which achieved perfect Recall scores of 1 also had the two
lowest AUC scores, indicating that these models had an extreme bias towards classifying
pores as Lack of Fusion pores, while poorly differentiating between the two classes.
In this evaluation, the Extreme Gradient Boosting classifier was the sole classifier to
achieve above average scores for all four metrics.
Figure 43 depicts the ROC curves for each classifier on the test set for Keyhole/Lack
of Fusion classification. As in the Gas/Non-Gas Test Set evaluation, the Decision Tree
Classifier produced a poor ROC curve, as did the Support Vector Machine and the
Multi-Layer Perceptron.
Figure 43: Lack of Fusion Pore classification ROC Curve Test set. A higher ROC curve
indicates a larger AUC score and consequently a more accurate classification
6.5: DISCUSSION AND CONCLUSIONS
This study evaluated the use of supervised machine learning approaches to help
determine the types of porosity present in the Ti-6Al-4V alloy parts, based on cross
sectional analysis of parts using optical microscopy. The performance of the following
eight supervised machine learning algorithms was investigated for the classification
of the porosity: K-Nearest Neighbour (KNN); Decision Tree Classifiers (DTC); Naïve
Bayes (NB); Support Vector Machine (SVM); Logistic Regression (LR); Multi-Layer
Perceptron (MLP); Extreme Gradient Boosted Classification (XGB) and Gradient
Boosted Classification (GB).
In the Gas/Non-Gas cross validation evaluation based on the Gas Flow dataset, it was
found that the Extreme Gradient Boosting model achieved the best performance for
overall accuracy, AUC score, F1 score, and Recall. Recall was defined in this case for
Non-Gas porosity. This implies that the Extreme Gradient Boosting model can not only
accurately identify Non-Gas pores, but also accurately distinguish them from Gas
pores. The superior performance of this algorithm on the Gas Flow dataset may be
due to the large difference in area between the two pore classes, with Gas pores being
far smaller than Non-Gas pores, which facilitates the distinction between them.
Upon training the same eight models on the ‘Gas Flow’ dataset, and applying these
algorithms to the Test Set data for Gas/Non-Gas classification, it was found that, in
contrast to the above cross validation analysis, a different classifier achieved the
highest performance in each metric. The Extreme Gradient Boosting classifier
was however found to achieve above average performance for three of the four
metrics, the exception being the F1 score. This indicates that applying the model to a
separate dataset reduced the Precision score of this classifier, meaning that it classified
some Gas pores as Non-Gas pores. The ROC curves for the classification algorithms in
this case were all similar in behaviour, except for the Decision Tree Classifier, which
had a poorer ROC curve than that obtained for the other models.
In the second binary classification for Keyhole / Lack of Fusion pores, F1 scores and
Recall were defined for Lack of Fusion pores. In the cross-validation examination of
the Gas Flow dataset, it was found that the only two algorithms which performed above
average in each metric were the Gradient Boosted and Extreme Gradient Boosted
classifiers. However, in contrast to the initial cross validation examination, the regular
Gradient Boosting outperformed Extreme Gradient boosting. In this comparison it was
interesting to note that some classifiers such as the Support Vector Machine and Multi-
Layer Perceptron, exhibited an extreme bias towards classifying Keyhole pores as
Lack of Fusion, as indicated by their high Recall and low AUC scores.
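The ‘above average in each metric’ criterion can be checked directly from the mean scores in Table 12. The sketch below copies those means (the ± terms are omitted) and recomputes the column averages; the function names are illustrative.

```python
# Mean cross-validation scores copied from Table 12, in the order
# (Accuracy, AUC, F1 Score, Recall); the ± terms are omitted.
table_12 = {
    "KNN": (0.632, 0.618, 0.733, 0.767),
    "DTC": (0.698, 0.670, 0.784, 0.788),
    "NB":  (0.528, 0.565, 0.532, 0.396),
    "SVM": (0.678, 0.597, 0.807, 1.000),
    "LR":  (0.648, 0.702, 0.771, 0.877),
    "MLP": (0.678, 0.503, 0.809, 0.999),
    "XGB": (0.738, 0.790, 0.813, 0.837),
    "GB":  (0.759, 0.801, 0.827, 0.864),
}


def column_means(table):
    """Average each metric over all models."""
    rows = list(table.values())
    return tuple(sum(column) / len(rows) for column in zip(*rows))


def above_average_everywhere(table):
    """Models whose score exceeds the column average in every metric."""
    means = column_means(table)
    return [model for model, scores in table.items()
            if all(score > mean for score, mean in zip(scores, means))]


means = column_means(table_12)
print(above_average_everywhere(table_12))  # ['XGB', 'GB']
```

The recomputed column averages agree with the Average row of Table 12 to three decimal places, and only the two boosted classifiers pass the criterion.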
The final evaluation, of Keyhole / Lack of Fusion porosity classification on the Test
Set, provided some interesting results that mirrored the Gas Flow dataset cross
validation results. Again, the Extreme Gradient Boosted classifier achieved above
average results in all metrics, and in this case was the sole classifier to do so, as the
regular Gradient Boosted classifier did not perform as well. Again, as in the
cross-validation results, the Support Vector Machine and Multi-Layer Perceptron both
achieved extremely high Recall scores but lower AUC scores, which was also reflected
in their ROC curves. This indicates that the same bias towards classifying Keyhole
pores as Lack of Fusion, which was evident in the cross-validation results for these
algorithms, is also found in the ‘Test Set’ evaluation.
Through analysis of the eight algorithms’ performance in both the cross validation
experiments on the Gas Flow dataset and the ‘Test Set’ evaluations for both binary
classification problems, the Extreme Gradient Boosted classifier exhibited the
strongest overall performance of the eight models investigated. It achieved above
average scores for all four metrics in both 5-fold cross validation experiments and in
both Test Set evaluation experiments (except for the F1 score in the Gas / Non-Gas
Test Set evaluation). As highlighted in the literature review, Extreme Gradient Boosted
classification is widely used for machine learning studies, due to its exceptional
performance in comparison to other classification algorithms. This was also evident in
this analysis of porosity type classification.
It was found that Gas / Non-Gas classification was more accurate, than Keyhole / Lack
of Fusion classification through 5-fold cross validation. This was evident in the
increased average accuracy, AUC, and F1 score, obtained for Gas / Non-Gas
classification. The corresponding average values for each classifier comparing
Keyhole / Lack of Fusion were much smaller as demonstrated in Table 15, except for
recall. Even though recall is larger in the Keyhole / Lack of Fusion classification, this
difference is marginal.
Metric              Gas/Non-Gas    Keyhole/Lack of Fusion
General Accuracy    0.816          0.670
AUC Score           0.878          0.656
F1-Score            0.846          0.759
Recall              0.792          0.816
Table 15: Average metric scores for all eight classification algorithms in both 5-fold cross
validation experiments obtained using the Gas Flow dataset. The values for the Gas/Non-Gas
classification are noticeably larger
A factor which is likely to help explain the results in Table 15 is that the area feature
in the Non-Gas class is much larger on average than in the Gas class in the initial
classification, due to Keyhole and Lack of Fusion pores being significantly larger than
Gas pores. In the Keyhole / Lack of Fusion classification analysis, the difference in
area between the classes is small.
In summary, the aim of this study was to investigate the ability of supervised
classification algorithms to differentiate between Gas pores and Non-Gas pores
(Keyhole and Lack of Fusion pores). It is concluded that the Extreme Gradient Boosted
Classifier performed exceptionally well in differentiating between Gas and Non-Gas
pores. When it was subsequently applied to Keyhole and Lack of Fusion pores, it was
also successful in distinguishing between these two pore types, although with lower
average scores than for the Gas / Non-Gas classification.
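The two binary classification steps described in this chapter can be chained to assign one of the three pore types, as sketched below. This is an illustrative arrangement rather than the implementation used in this study: in practice each step would be a trained classifier, whereas here the two decision rules are placeholder thresholds. The ‘circularity’ feature and both threshold values are assumptions made for illustration.

```python
def classify_pore(pore, is_non_gas, is_lack_of_fusion):
    """Two-step binary cascade: Gas vs Non-Gas, then Keyhole vs Lack of Fusion."""
    if not is_non_gas(pore):
        return "Gas"
    return "Lack of Fusion" if is_lack_of_fusion(pore) else "Keyhole"


# Placeholder decision rules with hypothetical thresholds; a trained model's
# predict step would take their place.
def toy_is_non_gas(pore):
    return pore["area"] > 100.0


def toy_is_lack_of_fusion(pore):
    return pore["circularity"] < 0.5


# Hypothetical pores (area in arbitrary units, circularity between 0 and 1).
pores = [
    {"area": 20.0, "circularity": 0.95},
    {"area": 400.0, "circularity": 0.85},
    {"area": 900.0, "circularity": 0.30},
]

predicted = [classify_pore(p, toy_is_non_gas, toy_is_lack_of_fusion) for p in pores]
print(predicted)  # ['Gas', 'Keyhole', 'Lack of Fusion']
```

One consequence of this arrangement is that errors in the first step carry through: a Non-Gas pore labelled as Gas never reaches the second classifier.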
CHAPTER 7: CONCLUSIONS
The aim of this thesis was to assess the impact that Argon gas flow during part printing
using the laser powder bed fusion process (LPBF), has on both the overall level and
type of porosity in Ti-6Al-4V alloy samples. The flow of argon gas was varied at three
levels in a production scale LPBF system (Renishaw RenAM 500M), in order to
evaluate its impact on porosity. Porosity measurement data was obtained based on
the scanning of printed parts using XCT, as well as optical microscopy examination of
sectioned alloy samples. A statistical unsupervised machine learning analysis was
carried out on the resulting datasets. The final section of this thesis involved a
comparison of the performance of eight supervised machine learning classification
algorithms in differentiating between the different types of porosity identified, based on
data obtained from optical microscopy (ImageJ) examination of alloy cross sections.
A summary of the main findings and conclusions of this study is as follows:
The initial study focused on an investigation of the effect of Argon gas flow rate
during LPBF on the level of porosity within Ti-6Al-4V samples. It was found
through XCT analysis (verified using Optical Microscopy analysis), that the build
produced at the low gas flow rate (26per hour) contained significantly
increased levels of porosity, relative to that obtained at the medium (31per
hour) and high (36per hour) gas flow rates. It was concluded that the 0.21%
porosity generated at the low gas flow rate was most likely due to the lower
gas flow during printing removing fewer process by-products than was the case at
the higher gas flow rates. The deposition of condensate and spatter on the
powder bed prior to laser processing results in an increased likelihood of
porosity formation. The overall level of porosity generated in the printed alloy
samples obtained at the medium gas flow, at 0.03%, was similar to the 0.01%
obtained at the high gas flow rate. This indicated that the medium gas flow rate
is suitable for commercialised manufacturing, while minimising Ar gas usage,
helping to explain why it is the Renishaw recommended flow level for use
with the RenAM 500M system.
XCT analysis was used to provide contour plots, which detailed the average
detected porosity obtained for each cylinder, as plotted at its location on the
build plate. A relatively homogeneous distribution of porosity across the build
plates was demonstrated at the medium and high gas flow rates. At the low
gas flow build however, there were substantial variations in porosity across the
build plate. The contour plots indicated that there was higher porosity for
cylinders printed close to the Ar gas inlet, rather than away from it. This result
is surprising, as it would be anticipated that the deposition of process by-products
such as condensate would be greater in the
print bed region away from the Ar gas inlet. There is no clear explanation for
this observation; gas flow modelling studies are one possible means
of helping to explain this result.
A comparison between the levels of porosity obtained using the XCT and Optical
Microscopy analysis approaches was carried out. A direct comparison
between these two measurement techniques is not possible, due to the
differences in the alloy areas examined (i.e. one cross section versus bulk
sample), as well as the image resolutions of the two methods. Nevertheless,
the optical microscopy images (at a broadly similar magnification) consistently
showed an increased level of porosity, when compared with that obtained
using the XCT data. Using the low gas flow rate as an example, the overall
porosity level in the XCT data was 0.21%, whereas in the optical
microscopy data it was found to be 0.65%. The higher level obtained for the latter
is most likely due to the lower minimum detectable diameter in optical
microscopy of approx. 1.2 µm, in comparison to approx. 12.5 µm obtained using
XCT. Thus the ‘smaller’ porosity detected using optical microscopy contributes
to the overall increased level of porosity obtained using this technique.
While the use of micrograph analysis facilitated the detection of smaller pores
in comparison to XCT, the use of the latter technique for porosity evaluation, is
the strongly preferred approach experimentally. The process of sectioning
materials and polishing them in order to obtain micrograph images is a very
time-consuming process. It also only provides porosity information across
individual cross sections of the total part, in contrast XCT analysis is relatively
rapid, is non-destructive and provides information on porosity within the entire
sample.
Silhouette score calculations were found to be the best method for
determining the number of natural clusters present within both the
XCT and Micrograph data. Using this analysis approach, it was found that the
optimal number of clusters present within each of the three porosity datasets
was 2. Although three classes of microstructural pore (gas, keyhole and lack of
fusion) are normally described for printed alloys in the literature, the optimal
value being determined as 2 may be due to only limited quantities of keyhole
and lack of fusion porosity occurring simultaneously within the Ti-6Al-4V alloy
samples obtained under the LPBF processing conditions investigated. This
was confirmed based on the microscopy cross sectional examination. The AM
processing conditions required for significant levels of the three types of pore
to form simultaneously in the build do not appear to have been present under
the LPBF processing conditions investigated.
In order to understand what forms of porosity were more prevalent at varying
levels of Ar gas flow, GMM clustering was applied to the XCT and microscopy
(ImageJ) porosity data. Following examination of the average feature values of
both clusters in each dataset, it was determined that the terms ‘Regular’ and
‘Irregular’ could be used to refer to the two cluster types. The Regular cluster
contains mostly gas type porosity, owing to its regular shape and smaller size,
and the Irregular cluster contains mostly keyhole and lack of fusion porosity,
owing to its irregular geometric features and large average area.
It was shown that Regular pores were the most common type of pore across all
three gas flows. It was also found that builds manufactured at higher gas flow
rates had higher Regular pore proportions than builds constructed at the low
gas flow rates. For example, using the XCT data analysis, it was found that
Regular pores made up 93.4% of the high gas flow build, in comparison to
74.3% in the low gas flow build. It was concluded after visual examination using
optical microscopy that most Irregular pores produced at the high gas flow rate
were keyhole, while most Irregular pores produced at the low gas flow rate were
lack of fusion pores. Table 16 shows the proportion of both clusters at each gas
flow rate, based on the XCT dataset.
Gas Flow Rate    Regular Pores    Irregular Pores
Low (26)         74.3%            25.7%
Medium (31)      95.4%            4.6%
High (36)        93.4%            6.6%
Table 16: Total proportion of both Regular and Irregular clusters at each gas flow within
the Gas Flow dataset, based on GMM clustering
Eight supervised classification algorithms were investigated, to compare their
ability to accurately classify the three forms of porosity obtained in Ti-6Al-4V
obtained by additive manufacturing. These were K-Nearest Neighbour;
Decision Tree Classifiers; Naïve Bayes; Support Vector Machine; Logistic
Regression; Multi-Layer Perceptron; Extreme Gradient Boosted Classification
and Gradient Boosted Classification. Each was applied to the optical
microscopy ImageJ Porosity datasets obtained at high and low magnifications
from the Gas Flow test sets, which had been manually labelled. In place of a
multiclass classification, a two-step binary classification was used. It was found
that Gas/Non-Gas classification was more accurate than Keyhole/Lack of
Fusion classification through 5-fold cross validation. This was evident in the
increased average accuracy, AUC, and F1 score obtained for Gas / Non-Gas
classification. The corresponding average values for each classifier comparing
Keyhole / Lack of Fusion were much smaller as demonstrated in Table 15,
except for recall. Even though recall is larger in the Keyhole / Lack of Fusion
classification, this difference is marginal.
Metric              Gas/Non-Gas    Keyhole/Lack of Fusion
General Accuracy    0.816          0.670
AUC Score           0.878          0.656
F1-Score            0.846          0.759
Recall              0.792          0.816
Table 17: Average metric scores for all eight classification algorithms in both 5-
fold cross validation experiments upon the Gas Flow dataset. The values for the
Gas/Non-Gas classification are noticeably larger
A factor which is likely to help explain these results is that the area feature in
the Non-Gas class is much larger on average, than obtained for the Gas class
in the initial classification, due to Keyhole and Lack of Fusion pores being
significantly larger than Gas pores. In the Keyhole/Lack of Fusion classification
analysis, the difference in area between the classes is small.
The eight supervised classification algorithms were compared using their
default parameters. The best performing classifiers based on the examination
of the ‘Test Set’ dataset were found to be Gradient Boosting and Extreme
Gradient Boosting, as they consistently achieved above average scores.
Extreme Gradient Boosting performed above average in
each metric for both test set evaluations, except for F1 Score in Gas/Non-Gas
classification, indicating a reduced precision in this instance for Non-Gas
porosity.
Both the Gradient Boosting model and the Extreme Gradient Boosting model
achieved the highest AUC scores of all eight models in both Test Set
evaluations, indicating that these models were able to most accurately separate
both pore types in each case. These models therefore combined the best AUC
scores with generally above average scores in the other metrics.
Some models, such as the Multi-Layer Perceptron model and
the Support Vector Machine, achieved maximum Recall scores in the
Keyhole/Lack of Fusion Test Set; however, the associated AUC scores were
only 0.51 and 0.46 respectively (Table 14). This indicates that they classed every single
pore in the study as being associated with Lack of Fusion, hence the perfect
recall. This demonstrates an extremely poor analysis performance for both of
these models, despite the maximum Recall values.
In summary, therefore, Gradient Boosting and Extreme Gradient Boosting
were the best performing models in this study, exhibiting above average
behaviour in each of the four metrics: Accuracy; AUC; F1 Score; and Recall.
These two models also performed well in both of the Gas Flow dataset cross
validation experiments.
Both supervised classification, as well as unsupervised clustering have
difficulties associated with their implementation. Supervised classification
required an initial manual labelling of pores into three independent classes,
which proved to be time consuming. Determining the number of clusters for the
unsupervised clustering proved difficult also, as four separate statistical
techniques had to be implemented. However, the unsupervised clustering
analysis may prove more useful going forward, as it requires no user knowledge
on porosity prior to implementation, in comparison to supervised classification
which requires a large amount of user input.
In summary, two separate machine learning model approaches were applied to
porosity analysis data, obtained through XCT and micrograph analysis of Ti-
6Al-4V alloy samples. An unsupervised clustering model known as the
Gaussian Mixture Model (GMM), as well as eight supervised classification
models, of which the best performing model was the Extreme Gradient Boosting
Classifier (XGB), were implemented.
Two clusters were identified in each dataset, and GMM clustering was applied.
It was found that the proportion of pores within the ‘Irregular’ cluster in each
dataset increased with decreasing Argon gas flow rate. These Irregular pores
were most likely Lack of Fusion pores, or Keyhole pores.
Extreme Gradient Boosting was identified as the optimal classification algorithm
of the eight implemented on the micrograph porosity dataset. This was due to
its above average result in each of the four classification metrics (general
accuracy, AUC score, F1 score, recall), in both binary classification analyses
on the ‘Test’ dataset. The XGB model also achieved above average results in
each metric in both 5-fold cross validation analyses on the Gas Flow dataset.
7.1 FUTURE WORK
Based on the results from this MSc thesis amongst the potential areas for future
research are:
Process parameters - Within this work, the only additive manufacturing process
parameter varied was the Argon gas flow rate. Although the effects of most
process parameters on overall porosity fraction are well studied in the literature, the
effect of process parameters upon the type of porosity is an area that requires further
study. This could potentially be investigated through a similar method to this thesis, by
XCT scanning components at identical build locations, and sectioning the same
components for ImageJ optical microscopy porosity analysis.
Supervised Approaches - The eight supervised classification algorithms that were
investigated were analysed using their default scikit-learn parameters. In practice, the
performance of a machine learning model can be improved through optimising the
parameters of that model. A more in-depth analysis of their performance could be
obtained through parameter optimisation upon the Gas Flow dataset. This may yield
a more accurate assessment of each model’s maximum accuracy in the classification
of porosity data obtained from optical microscopy images.
ML model optimisation - As the level of porosity data obtained through optical
microscopy is far larger than that obtained through XCT analysis, the pore class types
across each gas flow could be assessed using the Extreme Gradient Boosting
Classifier. This would be achieved by a labelled subset of the Gas Flow dataset being
used as a training set, and the entire microscopy dataset being used as the test set.
This would be similar to the Regular pore cluster proportion at each gas flow being
analysed through clustering, however this would be a class obtained through
supervised methods. This could yield a more accurate pore type proportion than the
XCT analysis, due to the minimum detectable diameter in optical microscopy being
much less than that obtained using XCT.
APPENDIX A
This appendix provides the BIC scores for each dataset in this study (both
microscopy magnifications and the XCT dataset) at each gas flow rate. This results
in nine graphs in total, with each graph containing the BIC score plotted at cluster
numbers from two to nine. This is plotted for each of the covariance types, as
described in section 5.2.
BIC SCORES
In this three-by-three grid each column represents a gas flow, and each row represents
a data source. Each of the nine graphs contains four individual line plots, one for each of
the covariance types. It can be seen from Figure 44 that no clear BIC minimum is
reached in any of the nine graphs, with each containing line plots that appear to
converge to some minimum far beyond the chosen range of clusters. It can also be
seen that in each case, the full covariance GMM produced the lowest BIC line
plot, while spherical covariance produced the highest BIC line plot.
Therefore, as this method does not provide a clear indication of cluster number, a new
method was implemented. However, this result does indicate that full covariance is the
optimal covariance type. Therefore, full covariance shall be implemented in the GMM
implementation section 5.5.
Figure 44: BIC scores vs cluster number for each GMM covariance type. Each column
corresponds to a gas flow rate, and each row corresponds to the XCT data or one of
the two optical microscopy data sources.
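For reference, a BIC sweep of the kind plotted in Figure 44 can be reproduced in outline with scikit-learn's GaussianMixture. The sketch below uses synthetic data in place of the porosity datasets and assumes that the four covariance types of section 5.2 correspond to scikit-learn's "spherical", "tied", "diag" and "full" options.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

# Synthetic stand-in for one dataset at one gas flow rate
# (assumption: three pore descriptors per pore).
X, _ = make_blobs(n_samples=400, centers=3, n_features=3, random_state=0)

covariance_types = ["spherical", "tied", "diag", "full"]
cluster_numbers = list(range(2, 10))  # two to nine clusters

# Fit one GMM per (covariance type, cluster number) and record its BIC.
bic_scores = {
    cov: [
        GaussianMixture(n_components=k, covariance_type=cov, random_state=0)
        .fit(X)
        .bic(X)
        for k in cluster_numbers
    ]
    for cov in covariance_types
}

# Lower BIC is better; report the lowest value over the whole sweep.
best_cov, best_k = min(
    ((cov, k) for cov in covariance_types for k in cluster_numbers),
    key=lambda ck: bic_scores[ck[0]][cluster_numbers.index(ck[1])],
)
print(f"lowest BIC: covariance_type={best_cov}, n_components={best_k}")
```

On well-separated synthetic clusters such a sweep usually shows a distinct minimum. The curves in Figure 44 continue to fall across the whole range of two to nine clusters, which is why the BIC was used here only to select the covariance type.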
REFERENCES
[1] King, W.E., Anderson, A.T., Ferencz, R.M., Hodge, N.E., Kamath, C., Khairallah,
S.A. and Rubenchik, A.M., 2015. Laser powder bed fusion additive manufacturing of
metals; physics, computational, and materials challenges. Applied Physics
Reviews, 2(4), p.041304.
[2] M. Javaid. Current status and challenges of additive manufacturing in orthopaedics:
An overview. Journal of Clinical Orthopaedics and Trauma, 10:380386, 2019.
[3] A. H. Mohd. Javaid. Role of ct and mri in the design and development of
orthopaedic model using additive manufacturing. Journal of Clinical Orthopaedics and
Trauma, 9:213217, 2018.
[4] W. J. James, M. A. Slabbekoorn, W. A. Edgin, and C. K. Hardin, “Correction of
congenital malar hypoplasia using stereolithography for presurgical planning,” Journal
of Oral and Maxillofacial Surgery, vol. 56, no. 4, pp. 512517, 1998.
[5] G. Fielding, A. Bandyopadhyay, and B. Susmita, “Effects of silica and zinc oxide
doping on mechanical and biological properties of 3D printed tricalcium phosphate
tissue engineering scaffolds,” Dental Materials, vol. 28, no. 2, pp. 113–122, 2012.
[6] I. Gibson, T. Kvan, and W. Ling, “Rapid prototyping for architectural models,” Rapid
Prototyping Journal, vol. 8, no. 2, pp. 9199, 2002.
[7] R. van Noort, “The future of dental devices is digital,” Dental Materials, vol. 28, no.
1, pp. 312, 2012.
[8] K. U. Bletzinger and E. Ramm, “Structural optimization and form finding of light
weight structures,” Computers and Structures, vol. 79, no. 2225, pp. 20532062,
2001.
[9] A.A. Shapiro, et al.Additive manufacturing for aerospace flight applications, J.
Spacecraft Rockets, 53 (5) (2016), pp. 952-959
[10] R. Kennedy, Outside the Box: How GE Aviation Entered the Brave New World of Additive
Manufacturing, GE Avitation, https://blog.geaviation.com/manufacturing/outside-the-
box-how-ge-aviation-entered-the-brave-new-world-of-additive-manufacturing/,
Accessed on 31/05/2022
[11] P. Picariello. Committee f42 on additive manufacturing technologies
[12] Jiang, J., Xu, X. and Stringer, J., 2018, December. A new support strategy for
reducing waste in additive manufacturing. In The 48th international conference on
computers and industrial engineering (CIE 48) (pp. 1-7).
[13] King, W.E., Anderson, A.T., Ferencz, R.M., Hodge, N.E., Kamath, C., Khairallah,
S.A. and Rubenchik, A.M., 2015. Laser powder bed fusion additive manufacturing of
metals; physics, computational, and materials challenges. Applied Physics
Reviews, 2(4), p.041304.
[14] Matthews, M.J., Guss, G., Khairallah, S.A., Rubenchik, A.M., Depond, P.J. and
King, W.E., 2017. Denudation of metal powder layers in laser powder-bed fusion
processes. In Additive Manufacturing Handbook (pp. 677-692). CRC Press.
[15] Shim, D.S., Baek, G.Y., Seo, J.S., Shin, G.Y., Kim, K.P. and Lee, K.Y., 2016.
Effect of layer thickness setting on deposition characteristics in direct energy
deposition (DED) process. Optics & Laser Technology, 86, pp.69-78.
[16] Javidani, M., Arreguin-Zavala, J., Danovitch, J., Tian, Y. and Brochu, M., 2017.
Additive manufacturing of AlSi10Mg alloy using direct energy deposition:
microstructure and hardness characterization. Journal of Thermal Spray
Technology, 26(4), pp.587-597.
[17] W. E. King. Laser powder bed fusion additive manufacturing of metals; physics,
computational, and materials challenges. Applied Physics Reviews, 2:041304, 2015.
[18] Tyralla, D. and Seefeld, T., 2021. Thermal based process monitoring for laser
powder bed fusion (LPBF). In Advanced Materials Research (Vol. 1161, pp. 123-130).
Trans Tech Publications Ltd.
[19] Imani, F., Gaikwad, A., Montazeri, M., Rao, P., Yang, H. and Reutzel, E., 2018.
Process mapping and in-process monitoring of porosity in laser powder bed fusion
using layerwise optical imaging. Journal of Manufacturing Science and
Engineering, 140(10).
[20] Cost and practicality of in-process monitoring for metal additive manufacturing,
https://www.metal-am.com/articles/cost-and-practicality-of-in-process-monitoring-for-
metal-3d-printing/, Accessed on 23/06/2021
[21] Lu, Y. and Wang, Y., 2021. Physics based compressive sensing to monitor
temperature and melt flow in laser powder bed fusion. Additive Manufacturing, 47,
p.102304.
[22] Brochure: InfiniAM Spectral Energy input and melt pool emissions monitoring
for AM systems, Renishaw,
https://www.renishaw.com/resourcecentre/en/details/(83304737-0e0f-4fff-a678-
eadf065820b4), Accessed on 31/05/2022
[23] InfiniAM Central, Renishaw, https://www.renishaw.com/en/infiniam-central--
39816, Accessed on 31/05/2022
[26] Jhabvala, Jamasp & Boillat, Eric & Antignac, Thibaud & Glardon, Rémy. (2010).
On the effect of scanning strategies in the selective laser melting process. Virtual and
Physical Prototyping. 5. 10.1080/17452751003688368.
[27] Robinson, J., Ashton, I., Fox, P., Jones, E. and Sutcliffe, C., 2018. Determination
of the effect of scan strategy on residual stress in laser powder bed fusion additive
manufacturing. Additive Manufacturing, 23, pp.13-24.
[24] Darragh S. Egan, Caitríona M. Ryan, Andrew C. Parnell, Denis P. Dowling, Using
in-situ process monitoring data to identify defective layers in Ti-6Al-4V additively
manufactured porous biomaterials, Journal of Manufacturing Processes, Volume 64,
2021, Pages 1248-1254, ISSN 1526-6125,
https://doi.org/10.1016/j.jmapro.2021.03.002,
(https://www.sciencedirect.com/science/article/pii/S1526612521001675)
[25] Sola, A. and Nouri, A., 2019. Microstructural porosity in additive manufacturing:
The formation and detection of pores in metal parts fabricated by powder bed
fusion. Journal of Advanced Manufacturing and Processing, 1(3), p.e10021.
[28] Selective laser melting (slm): 3d printing simply explained.
https://all3dp.com/2/selective-laser-melting-slm-3d-printing-simplyexplained/:
:text=SLM
[29] N. T. Aboulkhair. Reducing porosity in alsi10mg parts processed by selective laser
melting. Additive Manufacturing, 1-4:7786, 2014.
[30] S. Erol, A. Jaeger, P. Hold, K. Ott, and W. Sihn. Tangible industry 4.0: A scenario-
based approach to learning for the future of production. Procedia CIRP, 54:1318,
2016.
[31] Ferro, P., Meneghello, R., Savio, G. and Berto, F., 2020. A modified volumetric
energy densitybased approach for porosity assessment in additive manufacturing
process design. The International Journal of Advanced Manufacturing
Technology, 110(7), pp.1911-1921.
[32] S. L. Sing. Selective laser melting of titanium alloy with 50wt% tantalum: Effect of
laser process parameters on part quality. International Journal of Refractory Metals
and Hard Materials, 77:120127, 2018.
[33] W. Shi. Beam diameter dependence of performance in thick-layer and high-power
selective laser melting of ti-6al-4v. Materials, 11:1237, 2018.
[34] Shane Keaveney, Aleksey Shmeliov, Valeria Nicolosi, Denis P. Dowling,
Investigation of process by-products during the Selective Laser Melting of Ti6AL4V
powder, Additive Manufacturing, Volume 36, 2020, 101514, ISSN 2214-8604,
https://doi.org/10.1016/j.addma.2020.101514,
(https://www.sciencedirect.com/science/article/pii/S2214860420308861), Melt pool
emissions; Process monitoring; Powder bed fusion
[35] Montazeri, M. et al. (2018) ‘In−process monitoring of material
cross−contamination defects in laser powder bed fusion’, Journal of Manufacturing
Science and Engineering, Transactions of the ASME. American Society of Mechanical
Engineers (ASME), 140(11). doi: 10.1115/1.4040543.
[36] Amir Mostafaei,, Defects and anomalies in powder bed fusion metal additive
manufacturing, Current Opinion in Solid State and Materials Science, Volume 26,
Issue 2, 2022, 100974, ISSN 1359-0286,
https://doi.org/10.1016/j.cossms.2021.100974,
(https://www.sciencedirect.com/science/article/pii/S1359028621000772)
[37] M. Javaid, A. Haleem, Additive manufacturing applications in medical cases : A
literature based review, Alexandria J. Med. 54 (2018).
[38] J. Reijonen, A. Revuelta, T. Riipinen, K. Ruusuvuori, P. Puukko, On the effect of
shielding gas flow on porosity and melt pool geometry in laser powder bed fusion
additive manufacturing, Addit. Manuf. 32 (2020) 101030.
doi:10.1016/j.addma.2019.101030.
[39] Sun, Z., Tan, X., Wang, C., Descoins, M., Mangelinck, D., Tor, S.B., Jägle, E.A.,
Zaefferer, S. and Raabe, D., 2021. Reducing hot tearing by grain boundary
segregation engineering in additive manufacturing: example of an AlxCoCrFeNi high-
entropy alloy. Acta Materialia, 204, p.116505.
[40] A. Bin Anwar, Q.C. Pham, Study of the spatter distribution on the powder bed
during selective laser melting, Addit. Manuf. 22 (2018) 8697.
doi:10.1016/j.addma.2018.04.036.
[41] Sola, A. and Nouri, A. (2019) ‘Microstructural porosity in additive manufacturing:
The formation and detection of pores in metal parts fabricated by powder bed fusion’,
Journal of Advanced Manufacturing and Processing, 1(3), pp. 121. doi:
10.1002/amp2.10021.
[42] Essa, K., Jamshidi, P., Zou, J., Attallah, M.M. and Hassanin, H., 2018. Porosity
control in 316L stainless steel using cold and hot isostatic pressing. Materials &
Design, 138, pp.21-29.
[43] Xiong, Y., Han, Z., Qin, J., Dong, L., Zhang, H., Wang, Y., Chen, H. and Li, X.,
2021. Effects of porosity gradient pattern on mechanical performance of additive
manufactured Ti-6Al-4V functionally graded porous structure. Materials &
Design, 208, p.109911.
[44] Anton du Plessis, Effects of process parameters on porosity in laser powder bed
fusion revealed by X-ray tomography, Additive Manufacturing, Volume 30, 2019,
100871, ISSN 2214-8604, https://doi.org/10.1016/j.addma.2019.100871,
(https://www.sciencedirect.com/science/article/pii/S2214860419306979)
[45] M. Tang, P.C. Pistorius, J.L. Beuth, Prediction of lack-of-fusion porosity for
powder bed fusion, Addit. Manuf., 14 (2017), pp. 39-
48, 10.1016/j.addma.2016.12.001
[46]
W.E. King, H.D. Barth, V.M. Castillo, G.F. Gallegos, J.W. Gibbs, D.E. Hahn, C. Kam
ath, A.M. Rubenchik, Observation of keyhole-mode laser melting in laser powder-bed
fusion additive manufacturingJ.Mater.Process.Technol., 214 (2014),pp. 2915-
2925, 10.1016/j.jmatprotec.2014.06.005
[47]
T. Debroy, H.L. Wei, J.S. Zuback, T. Mukherjee, J.W. Elmer, J.O. Milewski, A.M. Bee
se, A. Wilson-Heid, A. De, W. Zhang, Additive manufacturing of metallic components
process, structure and properties, Prog. Mater. Sci., 92 (2018), pp. 112-
224, 10.1016/j.pmatsci.2017.10.001
[48] Ming Tang, P. Chris Pistorius, Jack L. Beuth, Prediction of lack-of-fusion porosity
for powder bed fusion, Additive Manufacturing, Volume 14, 2017, Pages 39-48, ISSN
2214-8604, https://doi.org/10.1016/j.addma.2016.12.001.,
https://www.sciencedirect.com/science/article/pii/S2214860416300471
[49] Zackary Snow, Abdalla R. Nassar, Edward W. Reutzel, Invited Review Article:
Review of the formation and impact of flaws in powder bed fusion additive
manufacturing, Additive Manufacturing, Volume 36, 2020, 101457, ISSN 2214-8604,
https://doi.org/10.1016/j.addma.2020.101457,
(https://www.sciencedirect.com/science/article/pii/S2214860420308290)
[50] H. Gong, K. Rafi, H. Gu, T. Starr, B. Stucker, Analysis of defect generation in Ti-
6Al-4V parts made using powder bed fusion additive manufacturing processes Addit.
Manuf., 1 (2014), pp. 87-98, 10.1016/j.addma.2014.08.002
[51] N.T. Aboulkhair, N.M. Everitt, I. Ashcroft, C. Tuck, Reducing porosity in AlSi10Mg
parts processed by selective laser melting, Addit. Manuf., 1 (2014), pp. 77-
86, 10.1016/j.addma.2014.08.001
[52] M. Seifi, A. Salem, J. Beuth, O. Harrysson, J.J. Lewandowski, Overview of
materials qualification needs for metal additive manufacturing, JOM, 68 (3) (2016),
pp. 747-764, 10.1007/s11837-015-1810-0
[53] Al-Maharma, A.Y., Patil, S.P. and Markert, B., 2020. Effects of porosity on the
mechanical properties of additively manufactured components: A critical
review. Materials Research Express, 7(12), p.122001.
[54] Anton du Plessis, Ina Yadroitsava, Stephan G. le Roux, Igor Yadroitsev, Johannes
Fieres, Christof Reinhart, Pierre Rossouw, Prediction of mechanical performance of
Ti6Al4V cast alloy based on microCT-based load simulation, Journal of Alloys and
Compounds, Volume 724, 2017, Pages 267-274, ISSN 0925-8388,
https://doi.org/10.1016/j.jallcom.2017.06.320.,
(https://www.sciencedirect.com/science/article/pii/S0925838817323289)
[55] Phutela C, Aboulkhair NT, Tuck CJ, Ashcroft I. The Effects of Feature Sizes in
Selectively Laser Melted Ti-6Al-4V Parts on the Validity of Optimised Process
Parameters. Materials (Basel). 2019 Dec 26;13(1):117. doi: 10.3390/ma13010117.
PMID: 31887981; PMCID: PMC6982097.
[56] S.N. Aqida, EFFECTS OF POROSITY ON MECHANICAL PROPERTIES OF
METAL MATRIX COMPOSITE: AN OVERVIEW, Jurnal Teknologi, 40(A) Jun. 2004:
1732
[57] Arana, M., Ukar, E., Rodriguez, I., Iturrioz, A. and Alvarez, P., 2021. Strategies to
reduce porosity in Al-Mg WAAM parts and their impact on mechanical
properties. Metals, 11(3), p.524.
[58] Liu, R., Liu, H., Sha, F., Yang, H., Zhang, Q., Shi, S. and Zheng, Z., 2018.
Investigation of the porosity distribution, permeability, and mechanical performance of
pervious concretes. Processes, 6(7), p.78.
[59] Torres-Sanchez, C., Norrito, M., Almushref, F.R. and Conway, P.P., 2021. The
impact of multimodal pore size considered independently from porosity on mechanical
performance and osteogenic behaviour of titanium scaffolds. Materials Science and
Engineering: C, 124, p.112026.
[60] Ferrar, B. et al. (2012) ‘Gas flow effects on selective laser melting (SLM)
manufacturing performance’, Journal of Materials Processing Technology, 212(2), pp.
355364. doi: 10.1016/j.jmatprotec.2011.09.020.
[61] Ladewig, A. et al. (2016) ‘Influence of the shielding gas flow on the removal of
process byproducts in the selective laser melting process’, Additive Manufacturing, 10,
pp. 19. doi: 10.1016/j.addma.2016.01.004.
[62] Pauzon, C. et al. (2021) ‘Control of residual oxygen of the process atmosphere
during laserpowder bed fusion processing of Ti-6Al-4V’, Additive Manufacturing, 38,
p. 101765. doi: https://doi.org/10.1016/j.addma.2020.101765.
[63] S. Kumar, 10.05 - Selective Laser Sintering/Melting, Editor(s): Saleem Hashmi,
Gilmar Ferreira Batalha, Chester J. Van Tyne, Bekir Yilbas, Comprehensive Materials
Processing, Elsevier, 2014, Pages 93-134, ISBN 9780080965338,
https://doi.org/10.1016/B978-0-08-096532-1.01003-7.,
https://www.sciencedirect.com/science/article/pii/B9780080965321010037
[64] Chen, Z. et al. (2020) ‘Process variation in Selective Laser Melting of Ti-6Al-4V
alloy’, MATEC Web of Conferences. Edited by P. Villechaise et al., 321, p. 03024. doi:
10.1051/matecconf/202032103024.
[65] Meng, L., McWilliams, B., Jarosinski, W. et al. Machine Learning in Additive
Manufacturing: A Review. JOM 72, 23632377 (2020).
https://doi.org/10.1007/s11837-020-04155-y
[66] Industry 4.0 and the fourth industrial revolution explained, i-scoop, https://www.i-
scoop.eu/industry-4-
0/#:~:text=Industry%204.0%20has%20been%20defined,and%20creating%20the%2
0smart%20factory%E2%80%9D., accessed on 31/05/2022
[67] E. Alpaydin, Introduction to Machine Learning, 3rd ed. (London: The MIT Press,
2009), p. 3.
[68] M.I. Jordan, T.M. Mitchell, Machine learning: trends, perspectives, and
prospectsScience., 349 (6245) (2015), pp. 255-260
[69] M. Kuderer, S. Gulati, W. Burgard, Learning driving styles for autonomous
vehicles from demonstration
IEEE International Conference on Robotics and Automation, IEEE (2015), pp. 2641-
2646
[70] J. Barnes, C. Cummings, Machine Learning and Additive Manufacturing: What
does the future hold?, Metal AM, https://www.metal-am.com/articles/machine-
learning-and-additive-manufacturing-what-does-the-future-hold/, Accessed on
31/05/2022
[71] W.E. Frazier, J. Mater. Eng. Perform. 23, 1917 (2014)
[72] W.J. Sames, F. List, S. Pannala, R.R. Dehoff, and S.S. Babu, Int. Mater. Rev. 61,
315 (2016).
[73] S.K. Everton, M. Hirsch, P. Stravroulakis, R.K. Leach, and A.T. Clare, Mater.
Des. 95, 431 (2016).
[74] T. Wang, T.-H. Kwok, C. Zhou, and S. Vader, J. Manuf. Syst. 47, 83 (2018).
[75] M. Khanzadeh, S. Chowdhury, M. Marufuzzaman, M.A. Tschopp, and L. Bian, J.
Manuf. Syst. 47, 69 (2018).
[76] Jung, Y. M., Whang, J. J. and Yun, S. (2020) ‘Sparse probabilistic K-means’,
Applied Mathematics and Computation, 382, p. 125328. doi:
10.1016/j.amc.2020.125328.
[77] Rani, Y. and Rohil, D. H. (2013) ‘A Study of Hierarchical Clustering Algorithm’, in
[78] Ahmad, P. H. (2015) ‘Performance Evaluation of Clustering Algorithm Using
Different Datasets’, IJARCSMS, 5(1), pp. 167–173.
[79] Sinaga, K. P. and Yang, M. (2020) ‘Unsupervised K-Means Clustering Algorithm’,
IEEE Access, 8, pp. 8071680727. doi: 10.1109/ACCESS.2020.2988796
[80] Unsupervised Learning: Slgorithms and Examples, AltextSoft.com,
https://www.altexsoft.com/blog/unsupervised-machine-learning/, Accessed on
12/05/2022
[81] Suárez, J. L., García, S. and Herrera, F. (2021) ‘A tutorial on distance metric
learning: Mathematical foundations, algorithms, experimental analysis, prospects and
challenges’, Neurocomputing, 425, pp. 300322. doi: 10.1016/j.neucom.2020.08.017.
[82] Rani, Y. and Rohil, D. H. (2013) ‘A Study of Hierarchical Clustering Algorithm’, in.
[83] Sinaga, K. P. and Yang, M. (2020) ‘Unsupervised K-Means Clustering Algorithm’,
IEEE Access, 8, pp. 8071680727. doi: 10.1109/ACCESS.2020.2988796.
[84] Dasgupta, A. and Wahed, A. (2014) ‘Chapter 4 - Laboratory Statistics and Quality
Control’, in Dasgupta, A. and Wahed, A. (eds) Clinical Chemistry, Immunology and
Laboratory Quality Control. San Diego: Elsevier, pp. 4766. doi:
https://doi.org/10.1016/B978-0-12- 407821-5.00004-8.
[85] McLachlan, G.J. and Rathnayake, S., 2014. On the number of components in a
Gaussian mixture model. Wiley Interdisciplinary Reviews: Data Mining and Knowledge
Discovery, 4(5), pp.341-355.
[86] Das, A. et al. (2019) ‘Deep learning based liver cancer detection using watershed
transform and Gaussian mixture model techniques’, Cognitive Systems Research, 54,
pp. 165175. doi: https://doi.org/10.1016/j.cogsys.2018.12.009.
[87] P. Bühlmann and S. Van De Geer, Statistics for High-Dimensional Data: Methods,
Theory and Applications, 1st ed. (Springer : Berlin, Germany, 2011).
[88] M. Khanzadeh, S. Chowdhury, M.A. Tschopp, H.R. Doude, M. Marufuzzaman,
and L. Bian, IISE Transactions 51, 437 (2019).
[89] H. Wu, Z. Yu, and Y. Wang, Measurement 136, 445 (2019).
[90] Zhao, X., Imandoust, A., Khanzadeh, M., Imani, F. and Bian, L., 2021. Automated
Anomaly Detection of Laser-Based Additive Manufacturing Using Melt Pool Sparse
Representation and Unsupervised Learning. In 2021 International Solid Freeform
Fabrication Symposium. University of Texas at Austin.
[91] Snell, R., Tammas-Williams, S., Chechik, L., Lyle, A., Hernández-Nava, E., Boig,
C., Panoutsos, G. and Todd, I., 2020. Methods for rapid pore classification in metal
additive manufacturing. JOM, 72(1), pp.101-109.
[92] Khanzadeh, M., Bian, L., Shamsaei, N. and Thompson, S.M., 2016. Porosity
detection of laser based additive manufacturing using melt pool morphology clustering.
In 2016 International Solid Freeform Fabrication Symposium. University of Texas at
Austin.
[93] S. Asiri, ‘Machine Learning Classifiers’, Towards Data Science.com,
https://towardsdatascience.com/machine-learning-classifiers-a5cc4e1b0623,
Accessed on 13/05/2022
[94] J. Brownlee, ‘4 Types of Classification Tasks in Machine Learning’, ‘Machine
Learning Mastery’, https://machinelearningmastery.com/types-of-classification-in-
machine-learning/, Accessed on 13/05/2022
[95] M. Khanzadeh, S. Chowdhury, M. Marufuzzaman, M.A. Tschopp, and L. Bian, J.
Manuf. Syst. 47, 69 (2018).
[96] Murphy, K.P., 2018. Machine learning: A probabilistic perspective (adaptive
computation and machine learning series).
[97] Taieb, S.B. and Hyndman, R.J., 2014. A gradient boosting approach to the Kaggle
load forecasting competition. International journal of forecasting, 30(2), pp.382-394.
[98] M. Khanzadeh, S. Chowdhury, M. Marufuzzaman, M. A. Tschopp, and L. Bian, J.
Manuf. Syst.,vol. 47, 69 (2018).
[99] J. Mazumder, Proc. CIRP,vol. 36, 187 (2015).
[100] M. S. Tootooni, A. Dsouza, R. Donovan, P. K. Rao, Z. J. Kong, and P. Borgesen,
J. Eng. Ind.,vol. 139, 091005 (2017).
[101] Barrios, Juan M, and Pablo E Romero. “Decision Tree Methods for Predicting
Surface Roughness in Fused Deposition Modeling Parts.” Materials (Basel,
Switzerland) vol. 12,16 2574. 12 Aug. 2019, doi:10.3390/ma12162574
[102] L. Scime and J. Beuth, Addit. Manuf.,vol. 24, 273 (2018).
[103] Julie Gheysen, Matthieu Marteleur, Camille van der Rest, Aude Simar, Efficient
optimization methodology for laser powder bed fusion parameters to manufacture
dense and mechanically sound parts validated on AlSi12 alloy, Materials & Design,
Volume 199, 2021, 109433, ISSN 0264-1275,
https://doi.org/10.1016/j.matdes.2020.109433.,
https://www.sciencedirect.com/science/article/pii/S0264127520309692
[104] Parand Akbari, Francis Ogoke, Ning-Yu Kao, Kazem Meidani, Chun-Yu Yeh,
William Lee, Amir Barati Farimani, MeltpoolNet: Melt pool characteristic prediction in
Metal Additive Manufacturing using machine learning, Additive Manufacturing, Volume
55, 2022, 102817, ISSN 2214-8604, https://doi.org/10.1016/j.addma.2022.102817.,
https://www.sciencedirect.com/science/article/pii/S2214860422002172
[105] Z. Biczó, I. Felde and S. Szénási, "Distorsion Prediction of Additive
Manufacturing Process using Machine Learning Methods," 2021 IEEE 15th
International Symposium on Applied Computational Intelligence and Informatics
(SACI), 2021, pp. 000249-000252, doi: 10.1109/SACI51354.2021.9465625.
[106] X. Yao, S.K. Moon, and G. Bi, Rapid Prototyping J. 23, 983 (2017).
[107] Y. Zhang, G.S. Hong, D. Ye, K. Zhu, and J.Y. Fuh, Mater. Des. 156, 458 (2018).
[108] J. Mazumder, Proc. CIRP 36, 187 (2015).
[109] AP&C, “Ti-6Al-4V Material Certificate.”
[110] RenAM 500M, PTSC Tooling & Metrology, https://ptscmetrology.com/renam-
500m/, Accessed on 23/05/2022
[111] InfiniAM Spectral Schematic, Renishaw,
https://www.renishaw.com/resourcecentre/en/details/--99581?lang=en, Accessed on
23/05/2022
[112] Phoenix Nanotom M 180 kV / 20 W X-ray nanoCT® system for high-resolution
analysis and 3D metrology, Cincinnati, Ohio, 2021
[113] VGStudio MAX 2.2 Reference Manual, Volume Graphics, Heidelberg, Germany,
2012
[114] VGStudio MAX 2.2 Reference Manual, Volume Graphics, Heidelberg, Germany,
2012
[115] Pedregosa, F., Varoquaux, Ga”el, Gramfort, A., Michel, V., Thirion, B., Grisel,
O., others. (2011). Scikit-learn: Machine learning in Python. Journal of Machine
Learning Research, 12(Oct), 28252830.
[116] Rasband, W.S., ImageJ, U. S. National Institutes of Health, Bethesda, Maryland,
USA, https://imagej.nih.gov/ij/, 1997-2018
[117] Sarah J. Wolff, Hui Wang, Benjamin Gould, Niranjan Parab, Ziheng Wu, Cang
Zhao, Aaron Greco, Tao Sun, In situ X-ray imaging of pore formation mechanisms and
dynamics in laser powder-blown directed energy deposition additive manufacturing,
International Journal of Machine Tools and Manufacture, Volume
166,2021,103743,ISSN,0890-6955,
https://doi.org/10.1016/j.ijmachtools.2021.103743.,
https://www.sciencedirect.com/science/article/pii/S0890695521000523
[118] Anton du Plessis, Philip Sperling, Andre Beerlink, Lerato Tshabalala, Shaik
Hoosain, Ntombi Mathe, Stephan G. le Roux, Standard method for microCT-based
additive manufacturing quality control 1: Porosity analysis, MethodsX, Volume 5, 2018,
Pages 1102-1110, ISSN 2215-0161, https://doi.org/10.1016/j.mex.2018.09.005.,
https://www.sciencedirect.com/science/article/pii/S221501611830147X
[119] Haghdadi, N., Laleh, M., Moyle, M. and Primig, S., 2021. Additive manufacturing
of steels: a review of achievements and challenges. Journal of Materials
Science, 56(1), pp.64-107.
[120] Furumoto, T., Koizumi, A., Alkahari, M.R., Anayama, R., Hosokawa, A., Tanaka,
R. and Ueda, T., 2015. Permeability and strength of a porous metal structure
fabricated by additive manufacturing. Journal of Materials Processing
Technology, 219, pp.10-16.
[121] He, X., Cai, D., Shao, Y., Bao, H. and Han, J., 2010. Laplacian regularized
gaussian mixture model for data clustering. IEEE Transactions on Knowledge and
Data Engineering, 23(9), pp.1406-1418.
[122] Cluster Analysis, NVIDA, Accessed on 28/06/2022, https://www.nvidia.com/en-
us/glossary/data-
science/clustering/#:~:text=Clustering%20is%20used%20to%20identify,databases%
2C%20among%20many%20other%20places.
[123] Fraley, C., Raftery, A.E., Murphy, T.B. and Scrucca, L., 2012. mclust version 4
for R: normal mixture modeling for model-based clustering, classification, and density
estimation (Vol. 597, p. 1). Technical report.
[124] GMM Covariances, Scikitlearn, https://scikit-
learn.org/stable/auto_examples/mixture/plot_gmm_covariances.html, Accessed on
23/05/2022
[125] Murtagh, F. and Contreras, P., 2012. Algorithms for hierarchical clustering: an
overview. Wiley Interdisciplinary Reviews: Data Mining and Knowledge
Discovery, 2(1), pp.86-97.
[126] Syarif, I., Prugel-Bennett, A., Wills, G. (2012). Unsupervised Clustering
Approach for Network Anomaly Detection. In: Benlamri, R. (eds) Networked Digital
Technologies. NDT 2012. Communications in Computer and Information Science, vol
293. Springer, Berlin, Heidelberg. https://doi.org/10.1007/9783-642-30507-8_13
[127] T. Bock, What is a dendrogram?, DISPLAYR Blog,
https://www.displayr.com/what-is-dendrogram/, Accessed on 23/05/2022
[128] Shi, C., Wei, B., Wei, S. et al. A quantitative discriminant method of elbow point
for the optimal number of clusters in clustering algorithm. J Wireless Com
Network 2021, 31 (2021). https://doi.org/10.1186/s13638-021-01910-w
[129] Gaussian Mixture Model Selection, Scikitlearn, https://scikit-
learn.org/stable/auto_examples/mixture/plot_gmm_selection.html, Accessed on
25/05/2022
[130] Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics
6(2), pp. 461-464.
[131] A. Bhardwaj, Silhouette Coefficient, Towards Data Science,
https://towardsdatascience.com/silhouette-coefficient-validating-clustering-
techniques-
e976bb81d10c#:~:text=Silhouette%20Coefficient%20or%20silhouette%20score%20i
s%20a%20metric%20used%20to,each%20other%20and%20clearly%20distinguishe
d, Accessed on 25/05/2022
[132] R Patro, Cross-Validation: K-Fold vs Monte Carlo,
https://towardsdatascience.com/cross-validation-k-fold-vs-monte-carlo-
e54df2fc179b, Accessed on 25/05/2022
[133] S. Tandale, Importance of Confusion Matrix in Machine Learning and
Cybersecurity, https://shravantandale456.medium.com/importance-of-confusion-
matrix-in-machine-learning-and-cybersecurity-80e67f5858fb, Accessed on
25/05/2022
[134] How to Calculate Precision, Recall, and F-Measure for Imbalanced Classification, J.
Brownlee, Machine Learning Master. Accessed on 30/06/2022,
https://machinelearningmastery.com/precision-recall-and-f-measure-for-imbalanced-
classification/#:~:text=Precision%20is%20a%20metric%20that,positive%20examples
%20that%20were%20predicted.
[135] Zhang, W., Jiang, H., Yang, Z., Yamakawa, S., Shimada, K. and Kara, L.B.,
2019. Data-driven upsampling of point clouds. Computer-Aided Design, 112, pp.1-13.